Re: [PATCH V4] mlx4_core: allocate ICM memory in page size chunks

2018-05-29 Thread Eric Dumazet
On 05/25/2018 10:23 AM, David Miller wrote: > From: Qing Huang > Date: Wed, 23 May 2018 16:22:46 -0700 > >> When a system is under memory pressure (high usage with fragments), >> the original 256KB ICM chunk allocations will likely trigger kernel >> memory management to enter slow path doing me

Re: [PATCH v2 net-next] tcp: use data length instead of skb->len in tcp_probe

2018-05-29 Thread Eric Dumazet
On 05/29/2018 07:36 AM, Yafang Shao wrote: > On Tue, May 29, 2018 at 10:15 PM, David Miller wrote: >> From: Yafang Shao >> Date: Fri, 25 May 2018 18:14:05 +0800 >> >>> skb->len is meaningless to user. >>> data length could be more helpful, with which we can easily filter out >>> the packet wit

Re: [PATCH v3 net-next 2/2] tcp: minor optimization around tcp_hdr() usage in tcp receive path

2018-05-28 Thread Eric Dumazet
On 05/28/2018 05:41 PM, Yafang Shao wrote: > OK. > > And what about introducing a new helper tcp_hdr_fast() ? > > /* use it when tcp header has not been pulled yet */ > static inline struct tcphdr *tcp_hdr_fast(const struct sk_buff *skb) > > { > > return (const struct tcphdr *)skb->

Re: [PATCH v3 net-next 2/2] tcp: minor optimization around tcp_hdr() usage in tcp receive path

2018-05-28 Thread Eric Dumazet
On Mon, May 28, 2018 at 8:36 AM Yafang Shao wrote: > This is additional to the commit ea1627c20c34 ("tcp: minor optimizations around tcp_hdr() usage"). > At this point, skb->data is same with tcp_hdr() as tcp header has not > been pulled yet. > Cc: Eric Dumazet >

Re: INFO: rcu detected stall in corrupted

2018-05-21 Thread Eric Dumazet
On 05/21/2018 11:09 AM, David Miller wrote: > From: syzbot > Date: Mon, 21 May 2018 11:05:02 -0700 > >> find_match+0x244/0x13a0 net/ipv6/route.c:691 >> find_rr_leaf net/ipv6/route.c:729 [inline] >> rt6_select net/ipv6/route.c:779 [inline] > > Hmmm, endless loop in find_rr_leaf or similar? >

Re: [PATCH] bpf: check NULL for sk_to_full_sk()

2018-05-21 Thread Eric Dumazet
On 05/21/2018 12:55 AM, YueHaibing wrote: > like commit df39a9f106d5 ("bpf: check NULL for sk_to_full_sk() return value"), > we should check sk_to_full_sk return value against NULL. > > Signed-off-by: YueHaibing > --- > include/linux/bpf-cgroup.h | 2 +- > 1 file changed, 1 insertion(+), 1 del

Re: INFO: rcu detected stall in is_bpf_text_address

2018-05-19 Thread Eric Dumazet
SCTP experts, please take a look. On 05/19/2018 08:55 AM, syzbot wrote: > Hello, > > syzbot found the following crash on: > > HEAD commit:    73fcb1a370c7 Merge branch 'akpm' (patches from Andrew) > git tree:   upstream > console output: https://syzkaller.appspot.com/x/log.txt?x=1462ec0f8000

Re: WARNING in ip_recv_error

2018-05-18 Thread Eric Dumazet
On 05/18/2018 05:08 AM, DaeRyong Jeong wrote: > We report the crash: WARNING in ip_recv_error > (I resend the email since I mistakenly missed the subject in my previous > email. I'm sorry.) > > > This crash has been found in v4.17-rc1 using RaceFuzzer (a modified > version of Syzkaller), which

Re: [PATCH v3] mlx4_core: allocate ICM memory in page size chunks

2018-05-17 Thread Eric Dumazet
On 05/17/2018 01:53 PM, Qing Huang wrote: > When a system is under memory pressure (high usage with fragments), > the original 256KB ICM chunk allocations will likely trigger kernel > memory management to enter slow path doing memory compact/migration > ops in order to complete high order memory a

Re: [PATCH V2] mlx4_core: allocate ICM memory in page size chunks

2018-05-15 Thread Eric Dumazet
On 05/15/2018 11:53 AM, Qing Huang wrote: > >> This is control path so it is less latency-sensitive. >> Let's not produce unnecessary degradation here, please call kvzalloc so we >> maintain a similar behavior when contiguous memory is available, and a >> fallback for resiliency. > > Not sure

Re: [PATCH v2] {net, IB}/mlx5: Use 'kvfree()' for memory allocated by 'kvzalloc()'

2018-05-13 Thread Eric Dumazet
On 05/13/2018 12:00 AM, Christophe JAILLET wrote: > When 'kvzalloc()' is used to allocate memory, 'kvfree()' must be used to > free it. > > Signed-off-by: Christophe JAILLET > --- > v1 -> v2: More places to update have been added to the patch Please add relevant Fixes: tag(s)

Re: INFO: rcu detected stall in kfree_skbmem

2018-05-11 Thread Eric Dumazet
On 05/11/2018 11:41 AM, Marcelo Ricardo Leitner wrote: > But calling ip6_xmit with rcu_read_lock is expected. tcp stack also > does it. > Thus I think this is more of an issue with IPv6 stack. If a host has > an extensive ip6tables ruleset, it probably generates this more > easily. > >>> sctp_

Re: [PATCH 14/32] net/tcp: convert to ->poll_mask

2018-05-11 Thread Eric Dumazet
On 05/11/2018 04:07 AM, Christoph Hellwig wrote: > Signed-off-by: Christoph Hellwig > --- > include/net/tcp.h | 4 ++-- > net/ipv4/af_inet.c | 3 ++- > net/ipv4/tcp.c | 31 ++- > net/ipv6/af_inet6.c | 3 ++- > 4 files changed, 20 insertions(+), 21 deletion

Re: [PATCH net-next v2] tcp: Add mark for TIMEWAIT sockets

2018-05-10 Thread Eric Dumazet
On 05/09/2018 11:53 PM, Jon Maxwell wrote: > This version has some suggestions by Eric Dumazet: > > - Use a local variable for the mark in IPv6 instead of ctl_sk to avoid SMP > races. > - Use the more elegant "IP4_REPLY_MARK(net, skb->mark) ?: sk->sk_mark" >
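The "?:" form quoted above is the GNU C conditional with an omitted middle operand: "a ?: b" yields a when a is non-zero and b otherwise, evaluating a only once. A minimal userspace sketch of that selection logic follows. The names fake_net, REPLY_MARK and pick_mark are hypothetical stand-ins, not the kernel definitions, and the macro only assumes that the reply mark is the packet mark when mark reflection is enabled and 0 otherwise.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-namespace setting (not the kernel struct). */
struct fake_net {
    int fwmark_reflect;   /* assumed: non-zero means "reflect the packet mark" */
};

/* Hypothetical stand-in for the reply-mark macro: packet mark if reflecting, else 0. */
#define REPLY_MARK(net, mark) ((net)->fwmark_reflect ? (mark) : 0)

/*
 * Pick the mark for a reply packet.
 * GNU C extension: "a ?: b" is "a ? a : b" with a evaluated once,
 * so the socket mark is only used when the reflected mark is 0.
 */
static uint32_t pick_mark(const struct fake_net *net,
                          uint32_t skb_mark, uint32_t sk_mark)
{
    return REPLY_MARK(net, skb_mark) ?: sk_mark;
}
```

Note that a reflected mark of 0 is indistinguishable from "no mark" here, so the socket mark wins in that case. The extension is accepted by GCC and Clang but is not standard C.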

Re: [PATCH net-next v1] tcp: Add mark for TIMEWAIT sockets

2018-05-09 Thread Eric Dumazet
On 05/09/2018 10:21 PM, Jon Maxwell wrote: ... > if (th->rst) > @@ -723,11 +724,17 @@ static void tcp_v4_send_reset(const struct sock *sk, > struct sk_buff *skb) > arg.tos = ip_hdr(skb)->tos; > arg.uid = sock_net_uid(net, sk && sk_fullsock(sk) ? sk : NULL); > local_bh_d

Re: [PATCH net-next] tcp: Add mark for TIMEWAIT sockets

2018-05-09 Thread Eric Dumazet
On 05/09/2018 07:07 PM, Jon Maxwell wrote: > Aidan McGurn from Openwave Mobility systems reported the following bug: > > "Marked routing is broken on customer deployment. Its effects are large > increase in Uplink retransmissions caused by the client never receiving > the final ACK to their FI

Re: KASAN: use-after-free Read in __dev_queue_xmit

2018-05-09 Thread Eric Dumazet
On 05/09/2018 12:21 PM, Willem de Bruijn wrote: > Indeed. The skb shared info struct is zeroed by dev_validate_header > as a result of dev->hard_header_len exceeding skb->end - skb->data. > > Not exactly sure yet how this can happen. The hard header length space > is accounted for during alloca

Re: BUG: spinlock bad magic in tun_do_read

2018-05-07 Thread Eric Dumazet
On 05/07/2018 10:54 PM, Cong Wang wrote: > On Mon, May 7, 2018 at 10:27 PM, syzbot > wrote: >> Hello, >> >> syzbot found the following crash on: >> >> HEAD commit:75bc37fefc44 Linux 4.17-rc4 >> git tree: upstream >> console output: https://syzkaller.appspot.com/x/log.txt?x=1162c6978000

Re: [PATCH] net: 8390: Fix possible data races in __ei_get_stats

2018-05-07 Thread Eric Dumazet
On 05/07/2018 07:16 PM, Jia-Ju Bai wrote: > Yes, "&dev->stats" will not change, because it is a fixed address. > But the field data in "dev->stats" is changed (rx_frame_errors, rx_crc_errors > and rx_missed_errors). > So if the driver returns "&dev->stats" without lock protection (like on line

Re: [PATCH] net: 8390: Fix possible data races in __ei_get_stats

2018-05-07 Thread Eric Dumazet
On 05/07/2018 05:51 PM, Jia-Ju Bai wrote: > > > On 2018/5/7 22:15, Eric Dumazet wrote: >> >> On 05/07/2018 07:08 AM, Jia-Ju Bai wrote: >>> The write operations to "dev->stats" are protected by >>> the spinlock on line 862-864, but the read

Re: [PATCH] net: 8390: Fix possible data races in __ei_get_stats

2018-05-07 Thread Eric Dumazet
On 05/07/2018 07:08 AM, Jia-Ju Bai wrote: > The write operations to "dev->stats" are protected by > the spinlock on line 862-864, but the read operations to > this data on line 858 and 867 are not protected by the spinlock. > Thus, there may exist data races for "dev->stats". > > To fix the dat

Re: WARNING in kernfs_add_one

2018-05-05 Thread Eric Dumazet
On 05/05/2018 09:40 AM, Greg KH wrote: > On Sat, May 05, 2018 at 08:47:02AM -0700, syzbot wrote: >> Hello, >> >> syzbot found the following crash on: >> >> HEAD commit:8fb11a9a8d51 net/ipv6: rename rt6_next to fib6_next >> git tree: net-next >> console output: https://syzkaller.appspot.

Re: [PATCH] net: disable UDP punt on sockets in RCV_SHUTDWON

2018-05-04 Thread Eric Dumazet
On 05/04/2018 02:08 PM, Chintan Shah wrote: > A UDP application which opens multiple sockets with same local > address/port combination (using SO_REUSEPORT/SO_REUSEADDR socket options); > and issues connect to a remote socket (using one of these local socket). > Now if the same socket, which issu

[PATCH v4 net-next 2/2] selftests: net: tcp_mmap must use TCP_ZEROCOPY_RECEIVE

2018-04-27 Thread Eric Dumazet
number of bytes that should be read using conventional read()/recv()/recvmsg() system calls, to skip a sequence of bytes that can not be mapped, because not properly page aligned. Signed-off-by: Eric Dumazet Cc: Andy Lutomirski Acked-by: Soheil Hassas Yeganeh --- tools/testing/selftests/net

[PATCH v4 net-next 0/2] tcp: mmap: rework zerocopy receive

2018-04-27 Thread Eric Dumazet
int in case user request was completed. Eric Dumazet (2): tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive selftests: net: tcp_mmap must use TCP_ZEROCOPY_RECEIVE include/uapi/linux/tcp.h | 8 + net/ipv4/af_inet.c | 2 + net/ipv4

[PATCH v4 net-next 1/2] tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive

2018-04-27 Thread Eric Dumazet
f mmap() and setsockopt(... TCP_ZEROCOPY_RECEIVE ...) Note that memcg might require additional changes. Fixes: 93ab6cc69162 ("tcp: implement mmap() for zero copy receive") Signed-off-by: Eric Dumazet Reported-by: syzbot Suggested-by: Andy Lutomirski Cc: linux...@kvack.org Acked-by:

Re: [PATCH v2 net-next 1/2] tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive

2018-04-27 Thread Eric Dumazet
On Fri, Apr 27, 2018 at 1:45 AM kbuild test robot wrote: > Hi Eric, > Thank you for the patch! Yet something to improve: > [auto build test ERROR on net-next/master] > url: https://github.com/0day-ci/linux/commits/Eric-Dumazet/tcp-add-TCP_ZEROCOPY_RECEIVE-support-for-zerocopy-rece

Re: [PATCH v2 net-next 0/2] tcp: mmap: rework zerocopy receive

2018-04-26 Thread Eric Dumazet
On 04/26/2018 02:16 PM, Andy Lutomirski wrote: > At the risk of further muddying the waters, there's another minor tweak > that could improve performance on certain workloads. Currently you mmap() > a range for a given socket and then getsockopt() to receive. If you made > it so you could mmap(

Re: [PATCH v2 net-next 0/2] tcp: mmap: rework zerocopy receive

2018-04-26 Thread Eric Dumazet
On 04/25/2018 06:20 PM, Soheil Hassas Yeganeh wrote: > > Acked-by: Soheil Hassas Yeganeh > > Thanks Soheil for reviewing. I have changed setsockopt() to getsockopt(), so I chose not to carry your Acked-by. Please add it back if you agree, thanks!

[PATCH v3 net-next 1/2] tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive

2018-04-26 Thread Eric Dumazet
f mmap() and setsockopt(... TCP_ZEROCOPY_RECEIVE ...) Note that memcg might require additional changes. Fixes: 93ab6cc69162 ("tcp: implement mmap() for zero copy receive") Signed-off-by: Eric Dumazet Reported-by: syzbot Suggested-by: Andy Lutomirski Cc: linux...@kvack.org Cc:

[PATCH v3 net-next 2/2] selftests: net: tcp_mmap must use TCP_ZEROCOPY_RECEIVE

2018-04-26 Thread Eric Dumazet
number of bytes that should be read using conventional read()/recv()/recvmsg() system calls, to skip a sequence of bytes that can not be mapped, because not properly page aligned. Signed-off-by: Eric Dumazet Cc: Andy Lutomirski Cc: Soheil Hassas Yeganeh --- tools/testing/selftests/net/tcp_mmap.c

[PATCH v3 net-next 0/2] tcp: mmap: rework zerocopy receive

2018-04-26 Thread Eric Dumazet
. v3: change TCP_ZEROCOPY_RECEIVE to be a getsockopt() option instead of setsockopt(), feedback from Ka-Cheong Poon v2: Added a missing page align of zc->length in tcp_zerocopy_receive() Properly clear zc->recv_skip_hint in case user request was completed. Eric Dumazet (2): tc

Re: [PATCH v2 net-next 1/2] tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive

2018-04-26 Thread Eric Dumazet
On 04/26/2018 06:40 AM, Ka-Cheong Poon wrote: > A quick question.  Is it a normal practice to return a result > in setsockopt() given that the optval parameter is supposed to > be a const void *? Very good question. Andy suggested an ioctl() or setsockopt(), and I chose setsockopt() but it loo
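The point of the question can be seen from the two POSIX prototypes: setsockopt() takes "const void *optval" and a length by value, while getsockopt() takes a writable "void *optval" and a "socklen_t *optlen", which makes it the natural value-result call. The sketch below illustrates this with the ordinary SO_REUSEADDR option rather than TCP_ZEROCOPY_RECEIVE; the helper names are hypothetical and chosen only for this example.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Input only: optval is const, so nothing can be returned through it. */
static int enable_reuseaddr(int fd)
{
    const int on = 1;

    return setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
}

/* Value-result: the kernel writes into *value and updates the length. */
static int read_reuseaddr(int fd, int *value)
{
    socklen_t len = sizeof(*value);

    return getsockopt(fd, SOL_SOCKET, SO_REUSEADDR, value, &len);
}

/* Demo: returns 1 if the option reads back as set, 0 if not, -1 on error. */
static int reuseaddr_roundtrip(void)
{
    int value = 0;
    int ret = -1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    if (enable_reuseaddr(fd) == 0 && read_reuseaddr(fd, &value) == 0)
        ret = value != 0;
    close(fd);
    return ret;
}
```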

[PATCH v2 net-next 1/2] tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive

2018-04-25 Thread Eric Dumazet
f mmap() and setsockopt(... TCP_ZEROCOPY_RECEIVE ...) Note that memcg might require additional changes. Fixes: 93ab6cc69162 ("tcp: implement mmap() for zero copy receive") Signed-off-by: Eric Dumazet Reported-by: syzbot Suggested-by: Andy Lutomirski Cc: linux...@kvack.org Cc:

[PATCH v2 net-next 2/2] selftests: net: tcp_mmap must use TCP_ZEROCOPY_RECEIVE

2018-04-25 Thread Eric Dumazet
number of bytes that should be read using conventional read()/recv()/recvmsg() system calls, to skip a sequence of bytes that can not be mapped, because not properly page aligned. Signed-off-by: Eric Dumazet Cc: Andy Lutomirski Cc: Soheil Hassas Yeganeh --- tools/testing/selftests/net/tcp_mmap.c

[PATCH v2 net-next 0/2] tcp: mmap: rework zerocopy receive

2018-04-25 Thread Eric Dumazet
. v2: Added a missing page align of zc->length in tcp_zerocopy_receive() Properly clear zc->recv_skip_hint in case user request was completed. Eric Dumazet (2): tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive selftests: net: tcp_mmap must use TCP_ZEROCOPY_RECEIVE include/uapi

Re: [PATCH net-next 1/2] tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive

2018-04-25 Thread Eric Dumazet
On 04/25/2018 09:35 AM, Eric Dumazet wrote: > > > On 04/25/2018 09:22 AM, Andy Lutomirski wrote: > >> In general, I suspect that the zerocopy receive mechanism will only >> really be a win in single-threaded applications that consume large >> amounts of rec

Re: [PATCH net-next 1/2] tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive

2018-04-25 Thread Eric Dumazet
On 04/25/2018 09:22 AM, Andy Lutomirski wrote: > In general, I suspect that the zerocopy receive mechanism will only > really be a win in single-threaded applications that consume large > amounts of receive bandwidth on a single TCP socket using lots of > memory and don't do all that much else.

Re: [PATCH net-next 1/2] tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive

2018-04-25 Thread Eric Dumazet
On 04/25/2018 09:04 AM, Matthew Wilcox wrote: > If you don't zap the page range, any of the CPUs in the system where > any thread in this task have ever run may have a TLB entry pointing to > this page ... if the page is being recycled into the page allocator, > then that page might end up as a

Re: [PATCH net-next 1/2] tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive

2018-04-25 Thread Eric Dumazet
On 04/24/2018 10:27 PM, Eric Dumazet wrote: > When adding tcp mmap() implementation, I forgot that socket lock > had to be taken before current->mm->mmap_sem. syzbot eventually caught > the bug. > + ... > + down_read(&current->mm->mmap_sem); > + > + ret

Re: [PATCH net-next 1/2] tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive

2018-04-25 Thread Eric Dumazet
On 04/24/2018 11:28 PM, Christoph Hellwig wrote: > On Tue, Apr 24, 2018 at 10:27:21PM -0700, Eric Dumazet wrote: >> When adding tcp mmap() implementation, I forgot that socket lock >> had to be taken before current->mm->mmap_sem. syzbot eventually caught >> the bug.

[PATCH net-next 1/2] tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive

2018-04-24 Thread Eric Dumazet
f mmap() and setsockopt(... TCP_ZEROCOPY_RECEIVE ...) Note that memcg might require additional changes. Fixes: 93ab6cc69162 ("tcp: implement mmap() for zero copy receive") Signed-off-by: Eric Dumazet Reported-by: syzbot Suggested-by: Andy Lutomirski Cc: linux...@kvack.org Cc:

[PATCH net-next 0/2] tcp: mmap: rework zerocopy receive

2018-04-24 Thread Eric Dumazet
. Eric Dumazet (2): tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive selftests: net: tcp_mmap must use TCP_ZEROCOPY_RECEIVE include/uapi/linux/tcp.h | 8 ++ net/ipv4/tcp.c | 186 + tools/testing/selftests/net/tcp_mmap.c

[PATCH net-next 2/2] selftests: net: tcp_mmap must use TCP_ZEROCOPY_RECEIVE

2018-04-24 Thread Eric Dumazet
number of bytes that should be read using conventional read()/recv()/recvmsg() system calls, to skip a sequence of bytes that can not be mapped, because not properly page aligned. Signed-off-by: Eric Dumazet Cc: Andy Lutomirski Cc: Soheil Hassas Yeganeh --- tools/testing/selftests/net/tcp_mmap.c

Re: [PATCH net-next] net: init sk_cookie for inet socket

2018-04-24 Thread Eric Dumazet
On 04/24/2018 04:47 AM, Yafang Shao wrote: > > Could you pls. explain the issue to me ? Just run a synflood test on your host, it will definitely show the atomic consuming most cpu cycles in inet_reqsk_alloc(), because of huge contention on a cache line shared by all cpus. Performance is redu

Re: [PATCH net-next] net: init sk_cookie for inet socket

2018-04-24 Thread Eric Dumazet
On 04/23/2018 09:39 PM, Yafang Shao wrote: > On Tue, Apr 24, 2018 at 12:09 AM, Eric Dumazet wrote: >> >> >> On 04/23/2018 08:58 AM, David Miller wrote: >>> From: Yafang Shao >>> Date: Sun, 22 Apr 2018 21:50:04 +0800 >>> >>>> With

Re: [PATCH net-next 0/4] mm,tcp: provide mmap_hook to solve lockdep issue

2018-04-23 Thread Eric Dumazet
On 04/23/2018 07:04 PM, Andy Lutomirski wrote: > On Mon, Apr 23, 2018 at 2:38 PM, Eric Dumazet wrote: >> Hi Andy >> >> On 04/23/2018 02:14 PM, Andy Lutomirski wrote: > >>> I would suggest that you rework the interface a bit. First a user would >>>

Re: [PATCH net-next 0/4] mm,tcp: provide mmap_hook to solve lockdep issue

2018-04-23 Thread Eric Dumazet
Hi Andy On 04/23/2018 02:14 PM, Andy Lutomirski wrote: > On 04/20/2018 08:55 AM, Eric Dumazet wrote: >> This patch series provide a new mmap_hook to fs willing to grab >> a mutex before mm->mmap_sem is taken, to ensure lockdep sanity. >> >> This hook allows us to sho

Re: [PATCH net-next] net: init sk_cookie for inet socket

2018-04-23 Thread Eric Dumazet
On 04/23/2018 08:58 AM, David Miller wrote: > From: Yafang Shao > Date: Sun, 22 Apr 2018 21:50:04 +0800 > >> With sk_cookie we can identify a socket, that is very helpful for >> tracing and statistics, e.g. tcp tracepoint and ebpf. >> So we'd better init it by default for inet socket. >> When u

Re: WARNING: suspicious RCU usage in rt6_check_expired

2018-04-23 Thread Eric Dumazet
On 04/23/2018 01:24 AM, syzbot wrote: > Hello, > > syzbot hit the following crash on net-next commit > 0638eb573cde5888c0886c7f35da604e5db209a6 (Sat Apr 21 20:06:14 2018 +) > Merge branch 'ipv6-Another-followup-to-the-fib6_info-change' > syzbot dashboard link: > https://syzkaller.appspot.co

Re: [PATCH tip/core/rcu 07/22] softirq: Eliminate unused cond_resched_softirq() macro

2018-04-23 Thread Eric Dumazet
ed-off-by: Paul E. McKenney > > Cc: Ingo Molnar > Fair enough, > Acked-by: Peter Zijlstra (Intel) Yes, I suggested this removal in https://www.spinics.net/lists/netdev/msg375161.html Reviewed-by: Eric Dumazet Thanks Paul.

Re: [PATCH net-next 0/4] mm,tcp: provide mmap_hook to solve lockdep issue

2018-04-21 Thread Eric Dumazet
On 04/21/2018 02:07 AM, Christoph Hellwig wrote: > On Fri, Apr 20, 2018 at 08:55:38AM -0700, Eric Dumazet wrote: >> This patch series provide a new mmap_hook to fs willing to grab >> a mutex before mm->mmap_sem is taken, to ensure lockdep sanity. >> >> This hook

[PATCH net-next 4/4] tcp: mmap: move the skb cleanup to tcp_mmap_hook()

2018-04-20 Thread Eric Dumazet
s to perform mm operations without delay. Note that the preparation work (building the array of page pointers) can also be done from tcp_mmap_hook() while mmap_sem has not been taken yet, but this is another independent change. Signed-off-by: Eric Dumazet --- net/ipv4/tcp.c | 20 +++-

[PATCH net-next 2/4] net: implement sock_mmap_hook()

2018-04-20 Thread Eric Dumazet
sock_mmap_hook() is the mmap_hook handler provided for socket_file_ops Following patch will provide tcp_mmap_hook() for TCP protocol. Signed-off-by: Eric Dumazet --- include/linux/net.h | 1 + net/socket.c| 9 + 2 files changed, 10 insertions(+) diff --git a/include/linux

[PATCH net-next 1/4] mm: provide a mmap_hook infrastructure

2018-04-20 Thread Eric Dumazet
in multi-threading programs. Fixes: 93ab6cc69162 ("tcp: implement mmap() for zero copy receive") Signed-off-by: Eric Dumazet Reported-by: syzbot --- include/linux/fs.h | 6 ++ mm/util.c | 19 ++- 2 files changed, 24 insertions(+), 1 deletion(-) diff --git a/i

[PATCH net-next 0/4] mm,tcp: provide mmap_hook to solve lockdep issue

2018-04-20 Thread Eric Dumazet
This patch series provide a new mmap_hook to fs willing to grab a mutex before mm->mmap_sem is taken, to ensure lockdep sanity. This hook allows us to shorten tcp_mmap() execution time (while mmap_sem is held), and improve multi-threading scalability. Eric Dumazet (4): mm: provide a mmap_h

[PATCH net-next 3/4] tcp: provide tcp_mmap_hook()

2018-04-20 Thread Eric Dumazet
tcp_mmap() execution time and thus increase mmap() performance in multi-threaded programs. Fixes: 93ab6cc69162 ("tcp: implement mmap() for zero copy receive") Signed-off-by: Eric Dumazet Reported-by: syzbot --- include/net/tcp.h | 1 + net/ipv4/af_inet.c | 1 + net/i

Re: [PATCH net] tcp: don't read out-of-bounds opsize

2018-04-20 Thread Eric Dumazet
On 04/20/2018 06:57 AM, Jann Horn wrote: > The old code reads the "opsize" variable from out-of-bounds memory (first > byte behind the segment) if a broken TCP segment ends directly after an > opcode that is neither EOL nor NOP. > > The result of the read isn't used for anything, so the worst th
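The problem described above can be illustrated with a userspace sketch of a TCP option walker: for any opcode other than EOL or NOP, the remaining length is checked before the size byte is read. This is only an illustration under assumptions, not the kernel code; the function name parse_tcp_options and the -1 return for malformed input are hypothetical choices for this example.

```c
#define TCPOPT_EOL 0   /* end of option list */
#define TCPOPT_NOP 1   /* one-byte padding */

/*
 * Walk a TCP option block of 'length' bytes.
 * Returns the number of sized options seen, or -1 if the block is malformed.
 * The size byte is read only after confirming it lies inside the buffer.
 */
static int parse_tcp_options(const unsigned char *ptr, int length)
{
    int count = 0;

    while (length > 0) {
        int opcode = *ptr++;
        int opsize;

        switch (opcode) {
        case TCPOPT_EOL:
            return count;
        case TCPOPT_NOP:
            length--;
            continue;
        default:
            if (length < 2)          /* no room left for the size byte */
                return -1;
            opsize = *ptr++;
            if (opsize < 2)          /* size counts opcode and size bytes */
                return -1;
            if (opsize > length)     /* option would run past the segment */
                return -1;
            count++;
            ptr += opsize - 2;       /* skip the option payload */
            length -= opsize;
        }
    }
    return count;
}
```

With the "length < 2" check placed first, a block that ends right after a sized opcode is rejected without touching the byte behind the buffer.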

Re: [PATCH] kvmalloc: always use vmalloc if CONFIG_DEBUG_VM

2018-04-19 Thread Eric Dumazet
On 04/19/2018 09:12 AM, Mikulas Patocka wrote: > > > These bugs are hard to reproduce because vmalloc falls back to kmalloc > only if memory is fragmented. > This sentence is wrong, because kvmalloc() falls back to vmalloc() ...

Re: WARNING: suspicious RCU usage in fib6_info_alloc

2018-04-18 Thread Eric Dumazet
On 04/18/2018 02:04 PM, David Ahern wrote: > On 4/18/18 3:02 PM, syzbot wrote: >> stack backtrace: >> CPU: 1 PID: 25 Comm: kworker/1:1 Not tainted 4.16.0+ #5 >> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS >> Google 01/01/2011 >> Workqueue: ipv6_addrconf addrconf_dad_wo

Re: [PATCH] net: don't use kvzalloc for DMA memory

2018-04-18 Thread Eric Dumazet
On 04/18/2018 10:55 AM, Michael S. Tsirkin wrote: > Imagine you want to pass some data to card. > Natural thing is to just put it in a variable and start DMA. > However DMA API disallows stack access nowadays, > so it's natural to put this within struct device. > > See e.g. > > commit a72

Re: [PATCH] net: don't use kvzalloc for DMA memory

2018-04-18 Thread Eric Dumazet
On 04/18/2018 09:44 AM, Mikulas Patocka wrote: > > > On Wed, 18 Apr 2018, Eric Dumazet wrote: > >> >> >> On 04/18/2018 07:34 AM, Mikulas Patocka wrote: >>> The patch 74d332c13b21 changes alloc_netdev_mqs to use vzalloc if kzalloc >>> fails (l

Re: [PATCH] net: don't use kvzalloc for DMA memory

2018-04-18 Thread Eric Dumazet
On 04/18/2018 07:34 AM, Mikulas Patocka wrote: > The patch 74d332c13b21 changes alloc_netdev_mqs to use vzalloc if kzalloc > fails (later patches change it to kvzalloc). > > The problem with this is that if the vzalloc function is actually used, > virtio_net doesn't work (because it expects tha

Re: [PATCH v2 net-next] net: introduce a new tracepoint for tcp_rcv_space_adjust

2018-04-17 Thread Eric Dumazet
On 04/17/2018 09:36 AM, Yafang Shao wrote: > tcp_rcv_space_adjust is called every time data is copied to user space, > introducing a tcp tracepoint for which could show us when the packet is > copied to user. > This could help us figure out whether there's latency in user process. > > When a tcp

Re: [PATCH net-next] net: introduce a new tracepoint for tcp_rcv_space_adjust

2018-04-16 Thread Eric Dumazet
On 04/16/2018 08:33 AM, Yafang Shao wrote: > tcp_rcv_space_adjust is called every time data is copied to user space, > introducing a tcp tracepoint for which could show us when the packet is > copied to user. > This could help us figure out whether there's latency in user process. > > When a tcp

Re: instant reboot caused by 194a9749c73d650c0

2018-04-16 Thread Eric Dumazet
On 04/16/2018 02:15 AM, Kirill A. Shutemov wrote: > On Mon, Apr 16, 2018 at 08:07:09AM +0200, Ingo Molnar wrote: >> >> * Eric Dumazet wrote: >> >>> Hi Kirill >>> >>> For some reason, my hosts instantly crash at boot time, with absolutely no

instant reboot caused by 194a9749c73d650c0

2018-04-14 Thread Eric Dumazet
Hi Kirill For some reason, my hosts instantly crash at boot time, with absolutely no log on console. Bisection pointed to : $ git bisect bad 194a9749c73d650c0b1dfdee04fb0bdf0a888ba8 is the first bad commit commit 194a9749c73d650c0b1dfdee04fb0bdf0a888ba8 Author: Kirill A. Shutemov Date: Mon M

Re: [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor

2018-04-10 Thread Eric Dumazet
On 04/10/2018 05:53 AM, Matthew Wilcox wrote: > From: Matthew Wilcox > > __GFP_ZERO requests that the object be initialised to all-zeroes, > while the purpose of a constructor is to initialise an object to a > particular pattern. We cannot do both. Add a warning to catch any > users who mista

Re: [PATCH] net: decnet: Replace GFP_ATOMIC with GFP_KERNEL in dn_route_init

2018-04-09 Thread Eric Dumazet
On 04/09/2018 07:10 AM, Jia-Ju Bai wrote: > dn_route_init() is never called in atomic context. > > The call chain ending up at dn_route_init() is: > [1] dn_route_init() <- decnet_init() > decnet_init() is only set as a parameter of module_init(). > > Despite never getting called from atomic con

Re: [PATCH] net: dccp: Replace GFP_ATOMIC with GFP_KERNEL in dccp_init

2018-04-09 Thread Eric Dumazet
On 04/09/2018 07:10 AM, Jia-Ju Bai wrote: > dccp_init() is never called in atomic context. > This function is only set as a parameter of module_init(). > > Despite never getting called from atomic context, > dccp_init() calls __get_free_pages() with GFP_ATOMIC, > which waits busily for allocatio

Re: WARNING in ip_rt_bug

2018-04-09 Thread Eric Dumazet
On 04/08/2018 11:06 PM, Dmitry Vyukov wrote: > On Mon, Apr 9, 2018 at 7:59 AM, syzbot > wrote: >> Hello, >> >> syzbot hit the following crash on net-next commit >> 8bde261e535257e81087d39ff808414e2f5aa39d (Sun Apr 1 02:31:43 2018 +) >> Merge tag 'mlx5-updates-2018-03-30' of >> git://git.kern

Re: [PATCH v2] net: thunderx: nicvf_main: Fix potential NULL pointer dereferences

2018-04-03 Thread Eric Dumazet
: add ndo_set_rx_mode callback > implementation for VF") > Signed-off-by: Gustavo A. R. Silva > --- > Changes in v2: > - Add a null check on a second kmalloc a few lines below. Thanks to >Eric Dumazet for pointing this out. > > drivers/net/ethernet/cavium/thunder/nicvf_main.c

Re: [PATCH] net: thunderx: nicvf_main: Fix potential NULL pointer dereference

2018-04-03 Thread Eric Dumazet
On 04/03/2018 02:29 PM, Gustavo A. R. Silva wrote: > Add null check on kmalloc() return value in order to prevent > a null pointer dereference. > > Addresses-Coverity-ID: 1467429 ("Dereference null return value") > Fixes: 37c3347eb247 ("net: thunderx: add ndo_set_rx_mode callback > implementati

Re: [PATCH] net: improve ipv4 performances

2018-04-01 Thread Eric Dumazet
On 04/01/2018 11:31 AM, Anton Gary Ceph wrote: > As the Linux networking stack is growing, more and more protocols are > added, increasing the complexity of stack itself. > Modern processors, contrary to common belief, are very bad in branch > prediction, so it's our task to give hints to the com

Re: [PATCH v2 1/1] xen-netback: process malformed sk_buff correctly to avoid BUG_ON()

2018-03-28 Thread Eric Dumazet
On 03/28/2018 08:51 PM, Dongli Zhang wrote: > The "BUG_ON(!frag_iter)" in function xenvif_rx_next_chunk() is triggered if > the received sk_buff is malformed, that is, when the sk_buff has pattern > (skb->data_len && !skb_shinfo(skb)->nr_frags). Below is a sample call > stack: > >... > > The

Re: [PATCH] x86, msr: fix rdmsrl_safe_on_cpu()

2018-03-28 Thread Eric Dumazet
On 03/28/2018 03:08 AM, Borislav Petkov wrote: > I guess now that the rdmsr* side does this, you probably should convert > the wrmsr* side as well. Yes indeed, thanks for the reminder.

Re: net_tx_action race condition?

2018-03-28 Thread Eric Dumazet
On 03/28/2018 12:30 AM, Saurabh Kr wrote: > Hi Eric/Angelo, > > We are seeing the assertion error in linux kernel 2.4.29 "kernel: KERNEL: > assertion (atomic_read(&skb->users) == 0) failed at dev.c(1397)". Based on > patch provided (https://patchwork.kernel.org/patch/5368051/) we mer

[tip:x86/cleanups] x86/msr: Make rdmsrl_safe_on_cpu() scheduling safe as well

2018-03-28 Thread tip-bot for Eric Dumazet
Commit-ID: 9b9a51354cae933f5640b5bb73bbcd32f989122f Gitweb: https://git.kernel.org/tip/9b9a51354cae933f5640b5bb73bbcd32f989122f Author: Eric Dumazet AuthorDate: Tue, 27 Mar 2018 20:22:33 -0700 Committer: Thomas Gleixner CommitDate: Wed, 28 Mar 2018 10:34:13 +0200 x86/msr: Make

[PATCH] x86, msr: fix rdmsrl_safe_on_cpu()

2018-03-27 Thread Eric Dumazet
() to schedule") Signed-off-by: Eric Dumazet Reported-by: kbuild test robot Cc: "H. Peter Anvin" Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov --- arch/x86/lib/msr-smp.c | 11 --- 1 file changed, 4 insertions(+), 7 deletions(-) diff --git a/arch/x86/lib/msr-smp.

Re: 07cde313b2 ("x86/msr: Allow rdmsr_safe_on_cpu() to schedule"): BUG: kernel hang in boot stage

2018-03-27 Thread Eric Dumazet
microseconds. > All usage sites are in preemptible context, convert rdmsr_safe_on_cpu() to > use a completion instead of busy polling. > Overall daemon cpu usage was reduced by 35 %, and latencies caused by > msr_read() disappeared. > Signed-of

[tip:x86/cleanups] x86/cpuid: Allow cpuid_read() to schedule

2018-03-27 Thread tip-bot for Eric Dumazet
Commit-ID: 67bbd7a8d6bcdc44cc27105ae8c374e9176ceaf1 Gitweb: https://git.kernel.org/tip/67bbd7a8d6bcdc44cc27105ae8c374e9176ceaf1 Author: Eric Dumazet AuthorDate: Fri, 23 Mar 2018 14:58:18 -0700 Committer: Thomas Gleixner CommitDate: Tue, 27 Mar 2018 12:01:48 +0200 x86/cpuid: Allow

[tip:x86/cleanups] x86/msr: Allow rdmsr_safe_on_cpu() to schedule

2018-03-27 Thread tip-bot for Eric Dumazet
Commit-ID: 07cde313b2d21f728cec2836db7cdb55476f7a26 Gitweb: https://git.kernel.org/tip/07cde313b2d21f728cec2836db7cdb55476f7a26 Author: Eric Dumazet AuthorDate: Fri, 23 Mar 2018 14:58:17 -0700 Committer: Thomas Gleixner CommitDate: Tue, 27 Mar 2018 12:01:47 +0200 x86/msr: Allow

Re: [PATCH v3 1/2] x86, msr: allow rdmsr_safe_on_cpu() to schedule

2018-03-24 Thread Eric Dumazet
On 03/24/2018 01:09 AM, Ingo Molnar wrote: > > * Eric Dumazet wrote: > >> I noticed high latencies caused by a daemon periodically reading >> various MSR on all cpus. KASAN kernels would see ~10ms latencies >> simply reading one MSR. Even without KASAN, sending

Re: [PATCH v3 2/2] x86, cpuid: allow cpuid_read() to schedule

2018-03-23 Thread Eric Dumazet
On 03/23/2018 03:17 PM, H. Peter Anvin wrote: > On 03/23/18 14:58, Eric Dumazet wrote: >> I noticed high latencies caused by a daemon periodically reading various >> MSR and cpuid on all cpus. KASAN kernels would see ~10ms latencies >> simply reading one cpuid. Even without

[PATCH v3 2/2] x86, cpuid: allow cpuid_read() to schedule

2018-03-23 Thread Eric Dumazet
consume hundreds of usec or more. Switching to smp_call_function_single_async() and a completion allows to reschedule and not burn cpu cycles. Signed-off-by: Eric Dumazet Cc: "H. Peter Anvin" Cc: Thomas Gleixner Cc: Borislav Petkov Cc: Ingo Molnar Cc: Hugh Dickins --- arch/

[PATCH v3 1/2] x86, msr: allow rdmsr_safe_on_cpu() to schedule

2018-03-23 Thread Eric Dumazet
hundreds of usec. Converts rdmsr_safe_on_cpu() to use a completion instead of busy polling. Overall daemon cpu usage was reduced by 35 %, and latencies caused by msr_read() disappeared. Signed-off-by: Eric Dumazet Cc: "H. Peter Anvin" Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav

Re: [PATCH v2 1/2] x86, msr: add rdmsr_safe_on_cpu_resched() and use it in msr_read()

2018-03-23 Thread Eric Dumazet
On 03/23/2018 02:27 PM, Thomas Gleixner wrote: > > Looking at all call sites. None of them is performance critical and all of > them are in preemptible context. > > So we simply can switch the rdmsr_safe_on_cpu() implementation over to wait > mode completely. SGTM, thanks for looking, I will se

Re: [PATCH] netlink: make sure nladdr has correct size in netlink_connect()

2018-03-23 Thread Eric Dumazet
Fixes: 1da177e4c3f41524 ("Linux-2.6.12-rc2") Reviewed-by: Eric Dumazet Thanks Alexander.

[PATCH v2 2/2] x86, cpuid: allow cpuid_read() to schedule

2018-03-19 Thread Eric Dumazet
consume hundreds of usec or more. Switching to smp_call_function_single_async() and a completion allows to reschedule and not burn cpu cycles. Signed-off-by: Eric Dumazet Cc: "H. Peter Anvin" Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Hugh Dickins --- arch/x86/kernel/cp

[PATCH v2 1/2] x86, msr: add rdmsr_safe_on_cpu_resched() and use it in msr_read()

2018-03-19 Thread Eric Dumazet
-by: Eric Dumazet Cc: "H. Peter Anvin" Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Hugh Dickins --- v2: fixed the missing part for !CONFIG_SMP arch/x86/include/asm/msr.h | 6 ++ arch/x86/kernel/msr.c | 2 +- arch/x86/lib/msr-smp.c | 43 +++

Re: [PATCH 1/2] x86, msr: add rdmsr_safe_on_cpu_resched() and use it in msr_read()

2018-03-18 Thread Eric Dumazet
op us a note to help improve the system] > url: https://github.com/0day-ci/linux/commits/Eric-Dumazet/x86-msr-add-rdmsr_safe_on_cpu_resched-and-use-it-in-msr_read/20180319-001007 > config: i386-randconfig-s1-201811 (attached as .config) > compiler: gcc-6 (Debian 6.4.0-9) 6.

[PATCH 2/2] x86, cpuid: allow cpuid_read() to schedule

2018-03-17 Thread Eric Dumazet
consume hundreds of usec or more. Switching to smp_call_function_single_async() and a completion allows to reschedule and not burn cpu cycles. Signed-off-by: Eric Dumazet Cc: "H. Peter Anvin" Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Hugh Dickins --- arch/x86/kernel/cp

[PATCH 1/2] x86, msr: add rdmsr_safe_on_cpu_resched() and use it in msr_read()

2018-03-17 Thread Eric Dumazet
-by: Eric Dumazet Cc: "H. Peter Anvin" Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Hugh Dickins --- arch/x86/include/asm/msr.h | 1 + arch/x86/kernel/msr.c | 2 +- arch/x86/lib/msr-smp.c | 43 ++ 3 files changed, 45 insertions(+), 1 deletio

Re: KASAN: use-after-free Read in pfifo_fast_enqueue

2018-03-14 Thread Eric Dumazet
On 03/14/2018 05:16 PM, Eric Dumazet wrote: > > typical use after free... > > diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c > index > 190570f21b208d5a17943360a3a6f85e1c2a2187..663e016491773f40f81d9bbfeab3dd68e1c2fc5c > 100644 > --- a/net/sched/s

Re: KASAN: use-after-free Read in pfifo_fast_enqueue

2018-03-14 Thread Eric Dumazet
On 03/14/2018 04:30 PM, syzbot wrote: > syzbot has found reproducer for the following crash on net-next commit > a870a02cc963de35452bbed932560ed69725c4f2 (Tue Mar 13 20:58:39 2018 +) > pktgen: use dynamic allocation for debug print buffer > > So far this crash happened 7 times on mmots, net-

Re: WARNING in __local_bh_enable_ip (2)

2018-03-14 Thread Eric Dumazet
On 03/14/2018 01:11 PM, syzbot wrote: Hello, syzbot hit the following crash on net-next commit be9fc0971a5c27b791608cf9705a04fe96dbd395 (Tue Mar 13 11:44:53 2018 +) net: fix sysctl_fb_tunnels_only_for_init_net link error So far this crash happened 2 times on net-next. Unfortunately, I don

Re: [PATCH] netlink: make sure nladdr has correct size in netlink_connect()

2018-03-14 Thread Eric Dumazet
On Wed, Mar 14, 2018 at 7:16 AM, Alexander Potapenko wrote: > > > > On Wed, Mar 14, 2018 at 3:11 PM Eric Dumazet wrote: >> >> On Wed, Mar 14, 2018 at 7:03 AM, Alexander Potapenko >> wrote: >> > KMSAN reports use of uninitialized memory in the case when |a

Re: [PATCH] netlink: make sure nladdr has correct size in netlink_connect()

2018-03-14 Thread Eric Dumazet
On Wed, Mar 14, 2018 at 7:03 AM, Alexander Potapenko wrote: > KMSAN reports use of uninitialized memory in the case when |alen| is > smaller than sizeof(struct netlink_sock), and therefore |nladdr| isn't > fully copied from the userspace. > > Signed-off-by: Alexander Potapenko > Fixes: 1da177e4c3

Re: [PATCH v2 1/1] net: check before dereferencing netdev_ops during busy poll

2018-03-12 Thread Eric Dumazet
ll; + else + busy_poll = NULL; do { rc = 0; We could instead setup a non NULL netdev_ops pointer on these 'dummy' devices to not add a check in fast path, but I presume we do not really care since this fix is for old kernels, and considering how long it took to discover this bug. Reviewed-by: Eric Dumazet

Re: WARNING in __proc_create

2018-03-09 Thread Eric Dumazet
On 03/09/2018 03:32 PM, Cong Wang wrote: On Fri, Mar 9, 2018 at 3:21 PM, Eric Dumazet wrote: On 03/09/2018 03:05 PM, Cong Wang wrote: BTW, the warning itself is all about empty names, so perhaps it's better to fix them separately. Huh ? You want more syzbot reports ? I do not
