Re: net/ipv4: division by 0 in tcp_select_window

2017-03-03 Thread Eric Dumazet
On Fri, 2017-03-03 at 10:25 -0800, Eric Dumazet wrote:
> On Fri, Mar 3, 2017 at 10:10 AM, Dmitry Vyukov wrote:
> > Hello,
> >
> > The following program triggers division by 0 in tcp_select_window:
> >
> > https://gist.githubusercontent.com/dvyukov/ef28c0fd2ab57a655508ef7621b12e6c/raw/079011e2a9523a390b0621cbc1e5d9d5e637fd6d/gistfile1.txt
> 
> Yeah, tcp_disconnect() should never have existed in the first place.
> 
> We'll send a patch, unless you take care of this before us.

Could you try this first patch?

Probably others will also be needed.

diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index 40d893556e6701ace6a02903e53c45822d6fa56d..2187ebf1f270d19e6dd019b8f9df5eef8d018e03 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -552,7 +552,8 @@ void tcp_write_timer_handler(struct sock *sk)
 	struct inet_connection_sock *icsk = inet_csk(sk);
 	int event;
 
-	if (sk->sk_state == TCP_CLOSE || !icsk->icsk_pending)
+	if (((1 << sk->sk_state) & (TCPF_CLOSE | TCPF_LISTEN)) ||
+	    !icsk->icsk_pending)
 		goto out;
 
 	if (time_after(icsk->icsk_timeout, jiffies)) {
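
For readers unfamiliar with the idiom in the new condition, here is a minimal user-space sketch of the state-mask test. The state numbers and TCPF_* values follow the kernel's include/net/tcp_states.h; write_timer_should_skip() is a hypothetical helper made up for illustration, not a kernel function.

```c
#include <stdbool.h>

/* Subset of the kernel's TCP state numbers (include/net/tcp_states.h). */
enum { TCP_ESTABLISHED = 1, TCP_CLOSE = 7, TCP_LISTEN = 10 };

/* Each TCPF_* flag is one bit, at the position given by the state number. */
enum {
	TCPF_CLOSE  = 1 << TCP_CLOSE,
	TCPF_LISTEN = 1 << TCP_LISTEN,
};

/*
 * Hypothetical helper (assumption, not kernel code): returns true when
 * the patched condition would take the "goto out" branch, i.e. the
 * socket is in CLOSE or LISTEN, or no timer event is pending.
 */
static bool write_timer_should_skip(int sk_state, int icsk_pending)
{
	return ((1 << sk_state) & (TCPF_CLOSE | TCPF_LISTEN)) || !icsk_pending;
}
```

Shifting 1 by the state number turns the state into a single bit, so one AND against a mask tests membership in a set of states instead of chaining == comparisons.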




Re: net/ipv4: division by 0 in tcp_select_window

2017-03-03 Thread Eric Dumazet
On Fri, Mar 3, 2017 at 10:24 AM, Dmitry Vyukov wrote:
> On Fri, Mar 3, 2017 at 7:10 PM, Dmitry Vyukov wrote:
>> Hello,
>>

> Wonder if this has been causing other crashes like this one?
>
> ------------[ cut here ]------------
> kernel BUG at net/ipv4/tcp_output.c:2748!
> Call Trace:
>  
>  tcp_retransmit_skb+0x2e/0x230 net/ipv4/tcp_output.c:2822
>  tcp_retransmit_timer+0x104c/0x2d50 net/ipv4/tcp_timer.c:491
>  tcp_write_timer_handler+0x334/0x9d0 net/ipv4/tcp_timer.c:574
>  tcp_write_timer+0x164/0x180 net/ipv4/tcp_timer.c:592
>  call_timer_fn+0x241/0x820 kernel/time/timer.c:1266
>  expire_timers kernel/time/timer.c:1305 [inline]
>  __run_timers+0x960/0xcf0 kernel/time/timer.c:1599
>  run_timer_softirq+0x21/0x80 kernel/time/timer.c:1612
>  __do_softirq+0x31f/0xbe7 kernel/softirq.c:284
>  invoke_softirq kernel/softirq.c:364 [inline]
>  irq_exit+0x1cc/0x200 kernel/softirq.c:405
>  exiting_irq arch/x86/include/asm/apic.h:658 [inline]
>  smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:962
>  apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:487
>
> 	if (before(TCP_SKB_CB(skb)->seq, tp->snd_una)) {
> 		if (before(TCP_SKB_CB(skb)->end_seq, tp->snd_una))
> 			BUG();

This path uses the socket lock. Probably a different problem.


Re: net/ipv4: division by 0 in tcp_select_window

2017-03-03 Thread Eric Dumazet
On Fri, Mar 3, 2017 at 10:10 AM, Dmitry Vyukov wrote:
> Hello,
>
> The following program triggers division by 0 in tcp_select_window:
>
> https://gist.githubusercontent.com/dvyukov/ef28c0fd2ab57a655508ef7621b12e6c/raw/079011e2a9523a390b0621cbc1e5d9d5e637fd6d/gistfile1.txt

Yeah, tcp_disconnect() should never have existed in the first place.

We'll send a patch, unless you take care of this before us.

Thanks.


Re: net/ipv4: division by 0 in tcp_select_window

2017-03-03 Thread Dmitry Vyukov
On Fri, Mar 3, 2017 at 7:10 PM, Dmitry Vyukov wrote:
> Hello,
>
> The following program triggers division by 0 in tcp_select_window:
>
> https://gist.githubusercontent.com/dvyukov/ef28c0fd2ab57a655508ef7621b12e6c/raw/079011e2a9523a390b0621cbc1e5d9d5e637fd6d/gistfile1.txt
>
> divide error:  [#1] SMP KASAN
> Modules linked in:
> CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.10.0+ #270
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> task: 88006c236340 task.stack: 88006c248000
> RIP: 0010:__tcp_select_window+0x6db/0x920 net/ipv4/tcp_output.c:2585
> RSP: 0018:88006cf86b40 EFLAGS: 00010206
> RAX: 00c4 RBX: 88006cf86cd8 RCX: dc00
> RDX:  RSI:  RDI: 8800686228bd
> RBP: 88006cf86d00 R08:  R09: 
> R10: 0004 R11: ed000d9f0e18 R12: 00c4
> R13: a700 R14: 880068622040 R15: 
> FS:  () GS:88006cf8() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 20e5cff8 CR3: 05021000 CR4: 001406e0
> Call Trace:
>  
>  tcp_select_window net/ipv4/tcp_output.c:270 [inline]
>  tcp_transmit_skb+0xc35/0x3460 net/ipv4/tcp_output.c:1014
>  tcp_xmit_probe_skb+0x36d/0x440 net/ipv4/tcp_output.c:3528
>  tcp_write_wakeup+0x23b/0x6d0 net/ipv4/tcp_output.c:3577
>  tcp_send_probe0+0xbf/0x5d0 net/ipv4/tcp_output.c:3593
>  tcp_probe_timer net/ipv4/tcp_timer.c:362 [inline]
>  tcp_write_timer_handler+0x849/0x9d0 net/ipv4/tcp_timer.c:578
>  tcp_write_timer+0x164/0x180 net/ipv4/tcp_timer.c:592
>  call_timer_fn+0x241/0x820 kernel/time/timer.c:1266
>  expire_timers kernel/time/timer.c:1305 [inline]
>  __run_timers+0x960/0xcf0 kernel/time/timer.c:1599
>  run_timer_softirq+0x21/0x80 kernel/time/timer.c:1612
>  __do_softirq+0x31f/0xbe7 kernel/softirq.c:284
>  invoke_softirq kernel/softirq.c:364 [inline]
>  irq_exit+0x1cc/0x200 kernel/softirq.c:405
>  exiting_irq arch/x86/include/asm/apic.h:658 [inline]
>  smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:962
>  apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:487
> RIP: 0010:native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:53
> RSP: 0018:88006c24fc10 EFLAGS: 0282 ORIG_RAX: ff10
> RAX: dc00 RBX: 11000d849f85 RCX: 
> RDX: 10a18ebc RSI: 0001 RDI: 850c75e0
> RBP: 88006c24fc10 R08: 88007fff70dc R09: 
> R10:  R11:  R12: 11000d849fa9
> R13: 88006c24fcc8 R14: 856972b8 R15: 88006c24fe68
>  
>  arch_safe_halt arch/x86/include/asm/paravirt.h:98 [inline]
>  default_idle+0xbf/0x440 arch/x86/kernel/process.c:271
>  arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:262
>  default_idle_call+0x36/0x90 kernel/sched/idle.c:96
>  cpuidle_idle_call kernel/sched/idle.c:154 [inline]
>  do_idle+0x373/0x520 kernel/sched/idle.c:243
>  cpu_startup_entry+0x18/0x20 kernel/sched/idle.c:345
>  start_secondary+0x36c/0x460 arch/x86/kernel/smpboot.c:272
>  start_cpu+0x14/0x14 arch/x86/kernel/head_64.S:306
> Code: 5d c3 e8 99 0d e5 fd 45 85 ff 44 89 bd 74 fe ff ff 0f 8f 14 fc
> ff ff 45 31 e4 eb 93 e8 7f 0d e5 fd 8b b5 74 fe ff ff 44 89 e0 99 
> fe 41 89 c4 44 0f af e6 e9 71 ff ff ff e8 62 0d e5 fd 4c 89
> RIP: __tcp_select_window+0x6db/0x920 net/ipv4/tcp_output.c:2585 RSP:
> 88006cf86b40
> ---[ end trace 5efcbe8231e36800 ]---
>
> On commit c82be9d2244aacea9851c86f4fb74694c99cd874.
>
> The guy that resets mss seems to be
> inet_csk_listen_start->inet_csk_delack_init. After that the timer
> fires and divides by icsk->icsk_ack.rcv_mss==0.
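
To illustrate the failure mode described in the quoted report, here is a small user-space sketch. It assumes the faulting expression rounds the free space down to a multiple of the MSS (the thread does not show the actual line at tcp_output.c:2585). round_window_to_mss() is a hypothetical helper, and returning 0 for a zero MSS is only an illustrative choice, not what the kernel does.

```c
#include <stdint.h>

/*
 * Hypothetical helper (assumption, not kernel code): round free_space
 * down to a whole number of MSS-sized segments. Integer division by a
 * zero mss would raise a divide error, so that case is guarded here.
 */
static uint32_t round_window_to_mss(uint32_t free_space, uint32_t mss)
{
	if (mss == 0)	/* e.g. the field was cleared by a re-initialisation */
		return 0;
	return (free_space / mss) * mss;
}
```

The patch posted in this thread takes a different route: rather than guarding the division, it stops the write timer handler from running at all on a socket in the LISTEN state.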


Wonder if this has been causing other crashes like this one?

------------[ cut here ]------------
kernel BUG at net/ipv4/tcp_output.c:2748!
Call Trace:
 
 tcp_retransmit_skb+0x2e/0x230 net/ipv4/tcp_output.c:2822
 tcp_retransmit_timer+0x104c/0x2d50 net/ipv4/tcp_timer.c:491
 tcp_write_timer_handler+0x334/0x9d0 net/ipv4/tcp_timer.c:574
 tcp_write_timer+0x164/0x180 net/ipv4/tcp_timer.c:592
 call_timer_fn+0x241/0x820 kernel/time/timer.c:1266
 expire_timers kernel/time/timer.c:1305 [inline]
 __run_timers+0x960/0xcf0 kernel/time/timer.c:1599
 run_timer_softirq+0x21/0x80 kernel/time/timer.c:1612
 __do_softirq+0x31f/0xbe7 kernel/softirq.c:284
 invoke_softirq kernel/softirq.c:364 [inline]
 irq_exit+0x1cc/0x200 kernel/softirq.c:405
 exiting_irq arch/x86/include/asm/apic.h:658 [inline]
 smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:962
 apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:487

	if (before(TCP_SKB_CB(skb)->seq, tp->snd_una)) {
		if (before(TCP_SKB_CB(skb)->end_seq, tp->snd_una))
			BUG();
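
As a rough illustration of what the check above tests, here is a user-space sketch of a wraparound-safe sequence comparison in the style of the kernel's before() helper. seq_before() and entirely_before_una() are hypothetical names chosen for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * "seq1 is before seq2" on 32-bit sequence numbers. The subtraction
 * wraps modulo 2^32 and the result is read as a signed value, so the
 * comparison stays correct across sequence-number wraparound (this
 * relies on the usual two's-complement conversion).
 */
static bool seq_before(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) < 0;
}

/*
 * Hypothetical helper (assumption, not kernel code): true when both the
 * start and the end of a segment are strictly before snd_una, which is
 * the case in which the snippet above reaches BUG().
 */
static bool entirely_before_una(uint32_t seq, uint32_t end_seq,
				uint32_t snd_una)
{
	return seq_before(seq, snd_una) && seq_before(end_seq, snd_una);
}
```

In other words, the BUG() fires when the retransmit path is handed a segment whose end already lies below snd_una.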