Re: Crash due to NULL dereference in tcp_rearm_rto

2018-04-13 Thread Eric Dumazet


On 04/13/2018 05:39 PM, Subash Abhinov Kasiviswanathan wrote:
> We are seeing a warning followed by a crash on an ARM64 device with
> Android 4.14 based kernel.
> 
> It looks like sk->sk_write_queue is empty and sk->sk_send_head is NULL.
> The head of the write queue is therefore NULL, and it is dereferenced in
> tcp_rto_delta_us() to read skb->skb_mstamp, which causes the crash.
> 
> Since this is 4.14.32, it already has commit ("tcp: reset sk_send_head in 
> tcp_write_queue_purge")
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/?h=v4.14.34&id=dbbf2d1e4077bab0c65ece2765d3fc69cf7d610f
> 
> 12876.013077:   <6> WARNING: CPU: 5 PID: 14828 at net/ipv4/tcp_output.c:2469 
> tcp_send_loss_probe+0x198/0x1b8
> 12876.038939:   <6> task: ffe73f7e5a80 task.stack: ff801b068000
> 12876.038941:   <2> PC is at tcp_send_loss_probe+0x198/0x1b8
> 12876.038942:   <2> LR is at tcp_send_loss_probe+0x28/0x1b8
> 12876.038944:   <2> pc : [] lr : [] 
> pstate: 60400145
> 12876.038944:   <2> sp : ff800802bd30
> 12876.038945:   <2> x29: ff800802bd50 x28: ff8dc9d83eb8
> 12876.038948:   <2> x27: ff800802be08 x26: ff8dc9737000
> 12876.038950:   <2> x25: 0001 x24: ffe744ea1728
> 12876.038952:   <2> x23: ffe73f7e5a80 x22: 0558
> 12876.038954:   <2> x21:  x20: 01080020
> 12876.038956:   <2> x19: ffe73d06e440 x18: 0020
> 12876.038958:   <2> x17: 0014 x16: 0030
> 12876.038960:   <2> x15:  x14: 
> 12876.038962:   <2> x13: 13af314c x12: 002773f8a550
> 12876.038965:   <2> x11: 0538 x10: 
> 12876.038967:   <2> x9 : 0020 x8 : ffe73d06e5f0
> 12876.038969:   <2> x7 : 00924278 x6 : ffe76fe9ed80
> 12876.038971:   <2> x5 : ffe76fe9ed80 x4 : 0f500458
> 12876.038973:   <2> x3 : ff800802bce0 x2 : ff800802bce8
> 12876.038975:   <2> x1 :  x0 : 0558
> 12876.039082:   <2> [] tcp_send_loss_probe+0x198/0x1b8
> 12876.039084:   <2> [] tcp_write_timer_handler+0xf8/0x1c4
> 12876.039086:   <2> [] tcp_write_timer+0x5c/0x98
> 12876.039089:   <2> [] call_timer_fn+0xc0/0x1b4
> 12876.039091:   <2> [] run_timer_softirq+0x230/0x850
> 12876.039094:   <2> [] __do_softirq+0x1dc/0x3a4
> 12876.039096:   <2> [] irq_exit+0xc8/0xd4
> 12876.039098:   <2> [] __handle_domain_irq+0x8c/0xc4
> 12876.039099:   <2> [] gic_handle_irq+0x164/0x1bc
> 
> [net/ipv4/tcp_output.c]
> void tcp_send_loss_probe(struct sock *sk)
> {
> struct tcp_sock *tp = tcp_sk(sk);
> struct sk_buff *skb;
> int pcount;
> int mss = tcp_current_mss(sk);
> 
> ...
> 
> /* Retransmit last segment. */
> if (WARN_ON(!skb))
>     goto rearm_timer;
> 
> 12876.043967:   <6> Unable to handle kernel NULL pointer dereference at 
> virtual address 0010
> 12876.091600:   <6> Internal error: Oops: 9605 [#1] PREEMPT SMP
> 12876.152597:   <2> PC is at tcp_rearm_rto+0x48/0x90
> 12876.156979:   <2> LR is at tcp_send_loss_probe+0x178/0x1b8
> 12876.162077:   <2> pc : [] lr : [] 
> pstate: 60400145
> 12876.169657:   <2> sp : ff800802bd10
> 12876.173056:   <2> x29: ff800802bd20 x28: ff8dc9d83eb8
> 12876.178511:   <2> x27: ff800802be08 x26: ff8dc9737000
> 12876.183967:   <2> x25: 0001 x24: ffe744ea1728
> 12876.189418:   <2> x23: ffe73f7e5a80 x22: 0558
> 12876.194863:   <2> x21:  x20: 01080020
> 12876.200312:   <2> x19: ffe73d06e440 x18: 0020
> 12876.205758:   <2> x17: 0014 x16: 0030
> 12876.211212:   <2> x15:  x14: 
> 12876.216660:   <2> x13: 13af314c x12: 002773f8a550
> 12876.222108:   <2> x11: 0538 x10: 
> 12876.227561:   <2> x9 :  x8 : ffe73d06e5f0
> 12876.233008:   <2> x7 : 00924278 x6 : ffe76fe9ed80
> 12876.238455:   <2> x5 : ffe76fe9ed80 x4 : 0f500458
> 12876.243907:   <2> x3 : ff800802bce0 x2 : ff800802bce8
> 12876.249360:   <2> x1 :  x0 : 0867
> 12876.473522:   <2> [] tcp_rearm_rto+0x48/0x90
> 12876.478971:   <2> [] tcp_send_loss_probe+0x178/0x1b8
> 12876.485131:   <2> [] tcp_write_timer_handler+0xf8/0x1c4
> 12876.491557:   <2> [] tcp_write_timer+0x5c/0x98
> 12876.497189:   <2> [] call_timer_fn+0xc0/0x1b4
> 12876.502731:   <2> [] run_timer_softirq+0x230/0x850
> 12876.508716:   <2> [] __do_softirq+0x1dc/0x3a4
> 12876.514260:   <2> [] irq_exit+0xc8/0xd4
> 12876.519261:   <2> [] __handle_domain_irq+0x8c/0xc4
> 12876.525245:   <2> [] gic_handle_irq+0x164/0x1bc
> 
> [net/ipv4/tcp_input.c]
> void tcp_rearm_rto(struct sock *sk)
> {
> ...
>     inet_csk_clear_xmit_timer(sk, ICSK_TIME_RETRANS);
> } else {
>     u32 rto = inet_csk(sk)->icsk_rto;
>     /* Offset the time elapsed after installing regular RTO */
>     if (icsk->icsk_pen

Crash due to NULL dereference in tcp_rearm_rto

2018-04-13 Thread Subash Abhinov Kasiviswanathan

We are seeing a warning followed by a crash on an ARM64 device with
Android 4.14 based kernel.

It looks like sk->sk_write_queue is empty and sk->sk_send_head is NULL.
The head of the write queue is therefore NULL, and it is dereferenced in
tcp_rto_delta_us() to read skb->skb_mstamp, which causes the crash.
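
To make the failure mode concrete, here is a minimal user-space sketch of the pattern described above. It is not the kernel's code and not a proposed patch: toy_skb, toy_sock, toy_write_queue_head() and toy_rto_delta_us() are hypothetical stand-ins, and falling back to the full RTO for an empty queue is only an assumption chosen for illustration. It shows how a helper that reads a timestamp from the queue head needs a NULL check once the queue can be empty.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct toy_skb {
	struct toy_skb *next;
	uint64_t mstamp_us;	/* time the segment was last sent, in usec */
};

struct toy_sock {
	struct toy_skb *write_queue_head;	/* NULL when the queue is empty */
	uint64_t now_us;			/* current timestamp, in usec */
	uint32_t rto_us;			/* retransmission timeout, in usec */
};

/* Return the head of the write queue, or NULL if the queue is empty. */
static inline const struct toy_skb *
toy_write_queue_head(const struct toy_sock *sk)
{
	return sk->write_queue_head;
}

/*
 * Time left until the RTO of the oldest queued segment expires.
 * An unguarded version would read skb->mstamp_us directly and fault
 * on an empty queue. Here an empty queue falls back to the full RTO
 * (an assumption made for this sketch only).
 */
static inline int64_t toy_rto_delta_us(const struct toy_sock *sk)
{
	const struct toy_skb *skb = toy_write_queue_head(sk);

	if (!skb)
		return (int64_t)sk->rto_us;

	return (int64_t)(skb->mstamp_us + sk->rto_us) - (int64_t)sk->now_us;
}
```

With an empty queue the helper returns the configured RTO instead of dereferencing NULL. With a segment stamped at 1000 usec, an RTO of 200000 usec and a current time of 5000 usec, it returns 196000.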

Since this is 4.14.32, it already has commit ("tcp: reset sk_send_head in tcp_write_queue_purge")
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/?h=v4.14.34&id=dbbf2d1e4077bab0c65ece2765d3fc69cf7d610f

12876.013077:   <6> WARNING: CPU: 5 PID: 14828 at net/ipv4/tcp_output.c:2469 tcp_send_loss_probe+0x198/0x1b8
12876.038939:   <6> task: ffe73f7e5a80 task.stack: ff801b068000
12876.038941:   <2> PC is at tcp_send_loss_probe+0x198/0x1b8
12876.038942:   <2> LR is at tcp_send_loss_probe+0x28/0x1b8
12876.038944:   <2> pc : [] lr : [] pstate: 60400145
12876.038944:   <2> sp : ff800802bd30
12876.038945:   <2> x29: ff800802bd50 x28: ff8dc9d83eb8
12876.038948:   <2> x27: ff800802be08 x26: ff8dc9737000
12876.038950:   <2> x25: 0001 x24: ffe744ea1728
12876.038952:   <2> x23: ffe73f7e5a80 x22: 0558
12876.038954:   <2> x21:  x20: 01080020
12876.038956:   <2> x19: ffe73d06e440 x18: 0020
12876.038958:   <2> x17: 0014 x16: 0030
12876.038960:   <2> x15:  x14: 
12876.038962:   <2> x13: 13af314c x12: 002773f8a550
12876.038965:   <2> x11: 0538 x10: 
12876.038967:   <2> x9 : 0020 x8 : ffe73d06e5f0
12876.038969:   <2> x7 : 00924278 x6 : ffe76fe9ed80
12876.038971:   <2> x5 : ffe76fe9ed80 x4 : 0f500458
12876.038973:   <2> x3 : ff800802bce0 x2 : ff800802bce8
12876.038975:   <2> x1 :  x0 : 0558
12876.039082:   <2> [] tcp_send_loss_probe+0x198/0x1b8
12876.039084:   <2> [] tcp_write_timer_handler+0xf8/0x1c4
12876.039086:   <2> [] tcp_write_timer+0x5c/0x98
12876.039089:   <2> [] call_timer_fn+0xc0/0x1b4
12876.039091:   <2> [] run_timer_softirq+0x230/0x850
12876.039094:   <2> [] __do_softirq+0x1dc/0x3a4
12876.039096:   <2> [] irq_exit+0xc8/0xd4
12876.039098:   <2> [] __handle_domain_irq+0x8c/0xc4
12876.039099:   <2> [] gic_handle_irq+0x164/0x1bc

[net/ipv4/tcp_output.c]
void tcp_send_loss_probe(struct sock *sk)
{
	struct tcp_sock *tp = tcp_sk(sk);
	struct sk_buff *skb;
	int pcount;
	int mss = tcp_current_mss(sk);

...

	/* Retransmit last segment. */
	if (WARN_ON(!skb))
		goto rearm_timer;

12876.043967:   <6> Unable to handle kernel NULL pointer dereference at virtual address 0010
12876.091600:   <6> Internal error: Oops: 9605 [#1] PREEMPT SMP
12876.152597:   <2> PC is at tcp_rearm_rto+0x48/0x90
12876.156979:   <2> LR is at tcp_send_loss_probe+0x178/0x1b8
12876.162077:   <2> pc : [] lr : [] pstate: 60400145
12876.169657:   <2> sp : ff800802bd10
12876.173056:   <2> x29: ff800802bd20 x28: ff8dc9d83eb8
12876.178511:   <2> x27: ff800802be08 x26: ff8dc9737000
12876.183967:   <2> x25: 0001 x24: ffe744ea1728
12876.189418:   <2> x23: ffe73f7e5a80 x22: 0558
12876.194863:   <2> x21:  x20: 01080020
12876.200312:   <2> x19: ffe73d06e440 x18: 0020
12876.205758:   <2> x17: 0014 x16: 0030
12876.211212:   <2> x15:  x14: 
12876.216660:   <2> x13: 13af314c x12: 002773f8a550
12876.222108:   <2> x11: 0538 x10: 
12876.227561:   <2> x9 :  x8 : ffe73d06e5f0
12876.233008:   <2> x7 : 00924278 x6 : ffe76fe9ed80
12876.238455:   <2> x5 : ffe76fe9ed80 x4 : 0f500458
12876.243907:   <2> x3 : ff800802bce0 x2 : ff800802bce8
12876.249360:   <2> x1 :  x0 : 0867
12876.473522:   <2> [] tcp_rearm_rto+0x48/0x90
12876.478971:   <2> [] tcp_send_loss_probe+0x178/0x1b8
12876.485131:   <2> [] tcp_write_timer_handler+0xf8/0x1c4
12876.491557:   <2> [] tcp_write_timer+0x5c/0x98
12876.497189:   <2> [] call_timer_fn+0xc0/0x1b4
12876.502731:   <2> [] run_timer_softirq+0x230/0x850
12876.508716:   <2> [] __do_softirq+0x1dc/0x3a4
12876.514260:   <2> [] irq_exit+0xc8/0xd4
12876.519261:   <2> [] __handle_domain_irq+0x8c/0xc4
12876.525245:   <2> [] gic_handle_irq+0x164/0x1bc

[net/ipv4/tcp_input.c]
void tcp_rearm_rto(struct sock *sk)
{
...
		inet_csk_clear_xmit_timer(sk, ICSK_TIME_RETRANS);
	} else {
		u32 rto = inet_csk(sk)->icsk_rto;
		/* Offset the time elapsed after installing regular RTO */
		if (icsk->icsk_pending == ICSK_TIME_REO_TIMEOUT ||
		    icsk->icsk_pending == ICSK_TIME_LOSS_PROBE) {
			s64 delta_us = tcp_rto_delta_us(sk);
			/*