My mitigation was accepted upstream:
https://lore.kernel.org/netdev/172708862651.3320223.2618494280244639290.git-patchwork-not...@kernel.org/

@mruffell I believe this should be picked up by the stable maintainers.
Do you know how long it will take once it lands there until it's in an
official Ubuntu kernel?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2077657

Title:
  Kernel Oops - BUG: kernel NULL pointer dereference, RIP:
  0010:tcp_rearm_rto+0xe4/0x160

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  We've recently started seeing the following crash on a number of
  machines that run a Ceph cluster. They are all running Ubuntu 20.04.6
  LTS.

  Jul 26 15:05:02 rx [11061395.780353] BUG: kernel NULL pointer dereference, 
address: 0000000000000020
  Jul 26 15:05:02 rx [11061395.787572] #PF: supervisor read access in kernel 
mode
  Jul 26 15:05:02 rx [11061395.792971] #PF: error_code(0x0000) - not-present 
page
  Jul 26 15:05:02 rx [11061395.798362] PGD 0 P4D 0 
  Jul 26 15:05:02 rx [11061395.801164] Oops: 0000 [#1] SMP NOPTI
  Jul 26 15:05:02 rx [11061395.805091] CPU: 0 PID: 9180 Comm: msgr-worker-1 
Tainted: G        W         5.4.0-174-generic #193-Ubuntu
  Jul 26 15:05:02 rx [11061395.814996] Hardware name: Supermicro SMC 2x26 
os-gen8 64C NVME-Y 256G/H12SSW-NTR, BIOS 2.5.V1.2U.NVMe.UEFI 05/09/2023
  Jul 26 15:05:02 rx [11061395.825952] RIP: 0010:tcp_rearm_rto+0xe4/0x160
  Jul 26 15:05:02 rx [11061395.830656] Code: 87 ca 04 00 00 00 5b 41 5c 41 5d 
5d c3 c3 49 8b bc 24 40 06 00 00 eb 8d 48 bb cf f7 53 e3 a5 9b c4 20 4c 89 ef 
e8 0c fe 0e 00 <48> 8b 78 20 48 c1 ef 03 48 89 f8 41 8b bc 24 80 04 00 00 48 f7 
e3
  Jul 26 15:05:02 rx [11061395.849665] RSP: 0018:ffffb75d40003e08 EFLAGS: 
00010246
  Jul 26 15:05:02 rx [11061395.855149] RAX: 0000000000000000 RBX: 
20c49ba5e353f7cf RCX: 0000000000000000
  Jul 26 15:05:02 rx [11061395.862542] RDX: 0000000062177c30 RSI: 
000000000000231c RDI: ffff9874ad283a60
  Jul 26 15:05:02 rx [11061395.869933] RBP: ffffb75d40003e20 R08: 
0000000000000000 R09: ffff987605e20aa8
  Jul 26 15:05:02 rx [11061395.877318] R10: ffffb75d40003f00 R11: 
ffffb75d4460f740 R12: ffff9874ad283900
  Jul 26 15:05:02 rx [11061395.884710] R13: ffff9874ad283a60 R14: 
ffff9874ad283980 R15: ffff9874ad283d30
  Jul 26 15:05:02 rx [11061395.892095] FS:  00007f1ef4a2e700(0000) 
GS:ffff987605e00000(0000) knlGS:0000000000000000
  Jul 26 15:05:02 rx [11061395.900438] CS:  0010 DS: 0000 ES: 0000 CR0: 
0000000080050033
  Jul 26 15:05:02 rx [11061395.906435] CR2: 0000000000000020 CR3: 
0000003e450ba003 CR4: 0000000000760ef0
  Jul 26 15:05:02 rx [11061395.913822] PKRU: 55555554
  Jul 26 15:05:02 rx [11061395.916786] Call Trace:
  Jul 26 15:05:02 rx [11061395.919488]  <IRQ>
  Jul 26 15:05:02 rx [11061395.921765]  ? show_regs.cold+0x1a/0x1f
  Jul 26 15:05:02 rx [11061395.925859]  ? __die+0x90/0xd9
  Jul 26 15:05:02 rx [11061395.929169]  ? no_context+0x196/0x380
  Jul 26 15:05:02 rx [11061395.933088]  ? ip6_protocol_deliver_rcu+0x4e0/0x4e0
  Jul 26 15:05:02 rx [11061395.938216]  ? ip6_sublist_rcv_finish+0x3d/0x50
  Jul 26 15:05:02 rx [11061395.943000]  ? __bad_area_nosemaphore+0x50/0x1a0
  Jul 26 15:05:02 rx [11061395.947873]  ? bad_area_nosemaphore+0x16/0x20
  Jul 26 15:05:02 rx [11061395.952486]  ? do_user_addr_fault+0x267/0x450
  Jul 26 15:05:02 rx [11061395.957104]  ? ipv6_list_rcv+0x112/0x140
  Jul 26 15:05:02 rx [11061395.961279]  ? __do_page_fault+0x58/0x90
  Jul 26 15:05:02 rx [11061395.965458]  ? do_page_fault+0x2c/0xe0
  Jul 26 15:05:02 rx [11061395.969465]  ? page_fault+0x34/0x40
  Jul 26 15:05:02 rx [11061395.973217]  ? tcp_rearm_rto+0xe4/0x160
  Jul 26 15:05:02 rx [11061395.977313]  ? tcp_rearm_rto+0xe4/0x160
  Jul 26 15:05:02 rx [11061395.981408]  tcp_send_loss_probe+0x10b/0x220
  Jul 26 15:05:02 rx [11061395.985937]  tcp_write_timer_handler+0x1b4/0x240
  Jul 26 15:05:02 rx [11061395.990809]  tcp_write_timer+0x9e/0xe0
  Jul 26 15:05:02 rx [11061395.994814]  ? tcp_write_timer_handler+0x240/0x240
  Jul 26 15:05:02 rx [11061395.999866]  call_timer_fn+0x32/0x130
  Jul 26 15:05:02 rx [11061396.003782]  __run_timers.part.0+0x180/0x280
  Jul 26 15:05:02 rx [11061396.008309]  ? recalibrate_cpu_khz+0x10/0x10
  Jul 26 15:05:02 rx [11061396.012841]  ? native_x2apic_icr_write+0x30/0x30
  Jul 26 15:05:02 rx [11061396.017718]  ? lapic_next_event+0x21/0x30
  Jul 26 15:05:02 rx [11061396.021984]  ? clockevents_program_event+0x8f/0xe0
  Jul 26 15:05:02 rx [11061396.027035]  run_timer_softirq+0x2a/0x50
  Jul 26 15:05:02 rx [11061396.031212]  __do_softirq+0xd1/0x2c1
  Jul 26 15:05:02 rx [11061396.035044]  do_softirq_own_stack+0x2a/0x40
  Jul 26 15:05:02 rx [11061396.039480]  </IRQ>
  Jul 26 15:05:02 rx [11061396.041840]  do_softirq.part.0+0x46/0x50
  Jul 26 15:05:02 rx [11061396.046022]  __local_bh_enable_ip+0x50/0x60
  Jul 26 15:05:02 rx [11061396.050460]  _raw_spin_unlock_bh+0x1e/0x20
  Jul 26 15:05:02 rx [11061396.054817]  nf_conntrack_tcp_packet+0x29e/0xbe0 
[nf_conntrack]
  Jul 26 15:05:02 rx [11061396.060994]  ? get_l4proto+0xe7/0x190 [nf_conntrack]
  Jul 26 15:05:02 rx [11061396.066220]  nf_conntrack_in+0xe9/0x670 
[nf_conntrack]
  Jul 26 15:05:02 rx [11061396.071618]  ipv6_conntrack_local+0x14/0x20 
[nf_conntrack]
  Jul 26 15:05:02 rx [11061396.077356]  nf_hook_slow+0x45/0xb0
  Jul 26 15:05:02 rx [11061396.081098]  ip6_xmit+0x3f0/0x5d0
  Jul 26 15:05:02 rx [11061396.084670]  ? ipv6_anycast_cleanup+0x50/0x50
  Jul 26 15:05:02 rx [11061396.089282]  ? __sk_dst_check+0x38/0x70
  Jul 26 15:05:02 rx [11061396.093381]  ? inet6_csk_route_socket+0x13b/0x200
  Jul 26 15:05:02 rx [11061396.098346]  inet6_csk_xmit+0xa7/0xf0
  Jul 26 15:05:02 rx [11061396.102263]  __tcp_transmit_skb+0x550/0xb30
  Jul 26 15:05:02 rx [11061396.106701]  tcp_write_xmit+0x3c6/0xc20
  Jul 26 15:05:02 rx [11061396.110792]  ? __alloc_skb+0x98/0x1d0
  Jul 26 15:05:02 rx [11061396.114708]  __tcp_push_pending_frames+0x37/0x100
  Jul 26 15:05:02 rx [11061396.119667]  tcp_push+0xfd/0x100
  Jul 26 15:05:02 rx [11061396.123150]  tcp_sendmsg_locked+0xc70/0xdd0
  Jul 26 15:05:02 rx [11061396.127588]  tcp_sendmsg+0x2d/0x50
  Jul 26 15:05:02 rx [11061396.131245]  inet6_sendmsg+0x43/0x70
  Jul 26 15:05:02 rx [11061396.135075]  __sock_sendmsg+0x48/0x70
  Jul 26 15:05:02 rx [11061396.138994]  ____sys_sendmsg+0x212/0x280
  Jul 26 15:05:02 rx [11061396.143172]  ___sys_sendmsg+0x88/0xd0
  Jul 26 15:05:02 rx [11061396.147098]  ? __seccomp_filter+0x7e/0x6b0
  Jul 26 15:05:02 rx [11061396.151446]  ? __switch_to+0x39c/0x460
  Jul 26 15:05:02 rx [11061396.155453]  ? __switch_to_asm+0x42/0x80
  Jul 26 15:05:02 rx [11061396.159636]  ? __switch_to_asm+0x5a/0x80
  Jul 26 15:05:02 rx [11061396.163816]  __sys_sendmsg+0x5c/0xa0
  Jul 26 15:05:02 rx [11061396.167647]  __x64_sys_sendmsg+0x1f/0x30
  Jul 26 15:05:02 rx [11061396.171832]  do_syscall_64+0x57/0x190
  Jul 26 15:05:02 rx [11061396.175748]  entry_SYSCALL_64_after_hwframe+0x5c/0xc1
  Jul 26 15:05:02 rx [11061396.181055] RIP: 0033:0x7f1ef692618d
  Jul 26 15:05:02 rx [11061396.184893] Code: 28 89 54 24 1c 48 89 74 24 10 89 
7c 24 08 e8 ca ee ff ff 8b 54 24 1c 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 2e 
00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2f 44 89 c7 48 89 44 24 08 e8 fe ee ff ff 
48
  Jul 26 15:05:02 rx [11061396.203889] RSP: 002b:00007f1ef4a26aa0 EFLAGS: 
00000293 ORIG_RAX: 000000000000002e
  Jul 26 15:05:02 rx [11061396.211708] RAX: ffffffffffffffda RBX: 
000000000000084b RCX: 00007f1ef692618d
  Jul 26 15:05:02 rx [11061396.219091] RDX: 0000000000004000 RSI: 
00007f1ef4a26b10 RDI: 0000000000000275
  Jul 26 15:05:02 rx [11061396.226475] RBP: 0000000000004000 R08: 
0000000000000000 R09: 0000000000000020
  Jul 26 15:05:02 rx [11061396.233859] R10: 0000000000000000 R11: 
0000000000000293 R12: 000000000000084b
  Jul 26 15:05:02 rx [11061396.241243] R13: 00007f1ef4a26b10 R14: 
0000000000000275 R15: 000055592030f1e8
  Jul 26 15:05:02 rx [11061396.248628] Modules linked in: vrf bridge stp llc 
vxlan ip6_udp_tunnel udp_tunnel nls_iso8859_1 amd64_edac_mod edac_mce_amd 
kvm_amd kvm crct10dif_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd 
glue_helper wmi_bmof ipmi_ssif input_leds joydev rndis_host cdc_ether usbnet 
mii ast drm_vram_helper ttm drm_kms_helper i2c_algo_bit fb_sys_fops syscopyarea 
sysfillrect sysimgblt ccp mac_hid ipmi_si ipmi_devintf ipmi_msghandler nft_ct 
sch_fq_codel nf_tables_set nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables 
nfnetlink ramoops reed_solomon efi_pstore drm ip_tables x_tables autofs4 raid10 
raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq 
libcrc32c raid0 multipath linear mlx5_ib ib_uverbs ib_core raid1 mlx5_core 
hid_generic pci_hyperv_intf crc32_pclmul tls usbhid ahci mlxfw bnxt_en libahci 
hid nvme i2c_piix4 nvme_core wmi
  Jul 26 15:05:02 rx [11061396.324334] CR2: 0000000000000020
  Jul 26 15:05:02 rx [11061396.327944] ---[ end trace 68a2b679d1cfb4f1 ]---
  Jul 26 15:05:02 rx [11061396.433435] RIP: 0010:tcp_rearm_rto+0xe4/0x160
  Jul 26 15:05:02 rx [11061396.438137] Code: 87 ca 04 00 00 00 5b 41 5c 41 5d 
5d c3 c3 49 8b bc 24 40 06 00 00 eb 8d 48 bb cf f7 53 e3 a5 9b c4 20 4c 89 ef 
e8 0c fe 0e 00 <48> 8b 78 20 48 c1 ef 03 48 89 f8 41 8b bc 24 80 04 00 00 48 f7 
e3
  Jul 26 15:05:02 rx [11061396.457144] RSP: 0018:ffffb75d40003e08 EFLAGS: 
00010246
  Jul 26 15:05:02 rx [11061396.462629] RAX: 0000000000000000 RBX: 
20c49ba5e353f7cf RCX: 0000000000000000
  Jul 26 15:05:02 rx [11061396.470012] RDX: 0000000062177c30 RSI: 
000000000000231c RDI: ffff9874ad283a60
  Jul 26 15:05:02 rx [11061396.477396] RBP: ffffb75d40003e20 R08: 
0000000000000000 R09: ffff987605e20aa8
  Jul 26 15:05:02 rx [11061396.484779] R10: ffffb75d40003f00 R11: 
ffffb75d4460f740 R12: ffff9874ad283900
  Jul 26 15:05:02 rx [11061396.492164] R13: ffff9874ad283a60 R14: 
ffff9874ad283980 R15: ffff9874ad283d30
  Jul 26 15:05:02 rx [11061396.499547] FS:  00007f1ef4a2e700(0000) 
GS:ffff987605e00000(0000) knlGS:0000000000000000
  Jul 26 15:05:02 rx [11061396.507886] CS:  0010 DS: 0000 ES: 0000 CR0: 
0000000080050033
  Jul 26 15:05:02 rx [11061396.513884] CR2: 0000000000000020 CR3: 
0000003e450ba003 CR4: 0000000000760ef0
  Jul 26 15:05:02 rx [11061396.521267] PKRU: 55555554
  Jul 26 15:05:02 rx [11061396.524230] Kernel panic - not syncing: Fatal 
exception in interrupt
  Jul 26 15:05:02 rx [11061396.530885] Kernel Offset: 0x1b200000 from 
0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
  Jul 26 15:05:03 rx [11061396.660181] ---[ end Kernel panic - not syncing: 
Fatal exception in interrupt ]---

  
  Earlier in the day we also saw the following WARNING get triggered:
  Jul 26 09:15:27  kernel: ------------[ cut here ]------------
  Jul 26 09:15:27  kernel: invalid inflight: 2 state 1 cwnd 68 mss 8988
  Jul 26 09:15:27  kernel: WARNING: CPU: 16 PID: 0 at 
net/ipv4/tcp_output.c:2605 tcp_send_loss_probe+0x214/0x220
  Jul 26 09:15:27  kernel: Modules linked in: vrf bridge stp llc vxlan 
ip6_udp_tunnel udp_tunnel nls_iso8859_1 nft_ct amd64_edac_mod edac_mce_amd 
kvm_amd kvm crct10dif_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd 
glue_helper wmi_bmof ipmi_ssif joydev input_leds rndis_host cdc_ether usbnet 
mii ast drm_vram_helper ttm drm_kms_he>
  Jul 26 09:15:27  kernel: CPU: 16 PID: 0 Comm: swapper/16 Not tainted 
5.4.0-174-generic #193-Ubuntu
  Jul 26 09:15:27  kernel: Hardware name: Supermicro SMC 2x26 os-gen8 64C 
NVME-Y 256G/H12SSW-NTR, BIOS 2.5.V1.2U.NVMe.UEFI 05/09/2023
  Jul 26 09:15:27  kernel: RIP: 0010:tcp_send_loss_probe+0x214/0x220
  Jul 26 09:15:27  kernel: Code: 08 26 01 00 75 e2 41 0f b6 54 24 12 41 8b 8c 
24 c0 06 00 00 45 89 f0 48 c7 c7 e0 b4 20 a7 c6 05 8d 08 26 01 01 e8 4a c0 0f 
00 <0f> 0b eb ba 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41
  Jul 26 09:15:27  kernel: RSP: 0018:ffffb7838088ce00 EFLAGS: 00010286
  Jul 26 09:15:27  kernel: RAX: 0000000000000000 RBX: ffff9b84b5630430 RCX: 
0000000000000006
  Jul 26 09:15:27  kernel: RDX: 0000000000000007 RSI: 0000000000000096 RDI: 
ffff9b8e4621c8c0
  Jul 26 09:15:27  kernel: RBP: ffffb7838088ce18 R08: 0000000000000927 R09: 
0000000000000004
  Jul 26 09:15:27  kernel: R10: 0000000000000000 R11: 0000000000000001 R12: 
ffff9b84b5630000
  Jul 26 09:15:27  kernel: R13: 0000000000000000 R14: 000000000000231c R15: 
ffff9b84b5630430
  Jul 26 09:15:27  kernel: FS:  0000000000000000(0000) 
GS:ffff9b8e46200000(0000) knlGS:0000000000000000
  Jul 26 09:15:27  kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  Jul 26 09:15:27  kernel: CR2: 000056238cec2380 CR3: 0000003e49ede005 CR4: 
0000000000760ee0
  Jul 26 09:15:27  kernel: PKRU: 55555554
  Jul 26 09:15:27  kernel: Call Trace:
  Jul 26 09:15:27  kernel:  <IRQ>
  Jul 26 09:15:27  kernel:  ? show_regs.cold+0x1a/0x1f
  Jul 26 09:15:27  kernel:  ? __warn+0x98/0xe0
  Jul 26 09:15:27  kernel:  ? tcp_send_loss_probe+0x214/0x220
  Jul 26 09:15:27  kernel:  ? report_bug+0xd1/0x100
  Jul 26 09:15:27  kernel:  ? do_error_trap+0x9b/0xc0
  Jul 26 09:15:27  kernel:  ? do_invalid_op+0x3c/0x50
  Jul 26 09:15:27  kernel:  ? tcp_send_loss_probe+0x214/0x220
  Jul 26 09:15:27  kernel:  ? invalid_op+0x1e/0x30
  Jul 26 09:15:27  kernel:  ? tcp_send_loss_probe+0x214/0x220
  Jul 26 09:15:27  kernel:  tcp_write_timer_handler+0x1b4/0x240
  Jul 26 09:15:27  kernel:  tcp_write_timer+0x9e/0xe0
  Jul 26 09:15:27  kernel:  ? tcp_write_timer_handler+0x240/0x240
  Jul 26 09:15:27  kernel:  call_timer_fn+0x32/0x130
  Jul 26 09:15:27  kernel:  __run_timers.part.0+0x180/0x280
  Jul 26 09:15:27  kernel:  ? timerqueue_add+0x9b/0xb0
  Jul 26 09:15:27  kernel:  ? enqueue_hrtimer+0x3d/0x90
  Jul 26 09:15:27  kernel:  ? do_error_trap+0x9b/0xc0
  Jul 26 09:15:27  kernel:  ? do_invalid_op+0x3c/0x50
  Jul 26 09:15:27  kernel:  ? tcp_send_loss_probe+0x214/0x220
  Jul 26 09:15:27  kernel:  ? invalid_op+0x1e/0x30
  Jul 26 09:15:27  kernel:  ? tcp_send_loss_probe+0x214/0x220
  Jul 26 09:15:27  kernel:  tcp_write_timer_handler+0x1b4/0x240
  Jul 26 09:15:27  kernel:  tcp_write_timer+0x9e/0xe0
  Jul 26 09:15:27  kernel:  ? tcp_write_timer_handler+0x240/0x240
  Jul 26 09:15:27  kernel:  call_timer_fn+0x32/0x130
  Jul 26 09:15:27  kernel:  __run_timers.part.0+0x180/0x280
  Jul 26 09:15:27  kernel:  ? timerqueue_add+0x9b/0xb0
  Jul 26 09:15:27  kernel:  ? enqueue_hrtimer+0x3d/0x90
  Jul 26 09:15:27  kernel:  ? recalibrate_cpu_khz+0x10/0x10
  Jul 26 09:15:27  kernel:  ? ktime_get+0x3e/0xa0
  Jul 26 09:15:27  kernel:  ? native_x2apic_icr_write+0x30/0x30
  Jul 26 09:15:27  kernel:  run_timer_softirq+0x2a/0x50
  Jul 26 09:15:27  kernel:  __do_softirq+0xd1/0x2c1
  Jul 26 09:15:27  kernel:  irq_exit+0xae/0xb0
  Jul 26 09:15:27  kernel:  smp_apic_timer_interrupt+0x7b/0x140
  Jul 26 09:15:27  kernel:  apic_timer_interrupt+0xf/0x20
  Jul 26 09:15:27  kernel:  </IRQ>
  Jul 26 09:15:27  kernel: RIP: 0010:native_safe_halt+0xe/0x10
  Jul 26 09:15:27  kernel: Code: 7b ff ff ff eb bd 90 90 90 90 90 90 e9 07 00 
00 00 0f 00 2d 36 2c 50 00 f4 c3 66 90 e9 07 00 00 00 0f 00 2d 26 2c 50 00 fb 
f4 <c3> 90 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 53 e8 dd 5e 61 ff 65
  Jul 26 09:15:27  kernel: RSP: 0018:ffffb783801cfe70 EFLAGS: 00000246 
ORIG_RAX: ffffffffffffff13
  Jul 26 09:15:27  kernel: RAX: ffffffffa6908b20 RBX: 0000000000000010 RCX: 
0000000000000001
  Jul 26 09:15:27  kernel: RDX: 000000006fc0c97e RSI: 0000000000000082 RDI: 
0000000000000082
  Jul 26 09:15:27  kernel: RBP: ffffb783801cfe90 R08: 0000000000000000 R09: 
0000000000000225
  Jul 26 09:15:27  kernel: R10: 0000000000100000 R11: 0000000000000000 R12: 
0000000000000010
  Jul 26 09:15:27  kernel: R13: ffff9b8e390b0000 R14: 0000000000000000 R15: 
0000000000000000
  Jul 26 09:15:27  kernel:  ? __cpuidle_text_start+0x8/0x8
  Jul 26 09:15:27  kernel:  ? default_idle+0x20/0x140
  Jul 26 09:15:27  kernel:  arch_cpu_idle+0x15/0x20
  Jul 26 09:15:27  kernel:  default_idle_call+0x23/0x30
  Jul 26 09:15:27  kernel:  do_idle+0x1fb/0x270
  Jul 26 09:15:27  kernel:  cpu_startup_entry+0x20/0x30
  Jul 26 09:15:27  kernel:  start_secondary+0x178/0x1d0
  Jul 26 09:15:27  kernel:  secondary_startup_64+0xa4/0xb0
  Jul 26 09:15:27  kernel: ---[ end trace e7ac822987e33be1 ]---

  The warning suggests there is an inconsistency in the accounting of
  tp->packets_out where the kernel has a non-zero value for the counter,
  but there are no packets in the transmit or retrans queues.
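
  To make the suspected inconsistency concrete, here is a minimal
  user-space sketch. The struct and function names are hypothetical
  stand-ins, not kernel code; they only model the condition described
  above (a non-zero counter while both queues are empty).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct mock_skb;	/* opaque placeholder for struct sk_buff */

/* Hypothetical, simplified view of the per-socket TCP state. */
struct mock_tcp_state {
	uint32_t packets_out;			/* models tp->packets_out */
	const struct mock_skb *send_head;	/* head of the write queue */
	const struct mock_skb *rtx_head;	/* head of the retransmit queue */
};

/* True when the counter claims packets are in flight but no queue holds any. */
static bool accounting_inconsistent(const struct mock_tcp_state *tp)
{
	bool queues_empty = tp->send_head == NULL && tp->rtx_head == NULL;

	return queues_empty && tp->packets_out != 0;
}
```

  With packets_out set to 2 (the "inflight" value in the WARNING) and
  both heads NULL, accounting_inconsistent() returns true.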

  I grabbed the debug symbols for this kernel and looked up the RIP of
  where we're seeing the null ptr deref. It points to here:

  (gdb) list *(tcp_rearm_rto+0xe4)
  0xffffffff819a2d64 is in tcp_rearm_rto 
(/build/linux-G34Rii/linux-5.4.0/include/linux/math64.h:126).
  warning: Source file is more recent than executable.
  121     */
  122    #ifndef div_u64
  123    static inline u64 div_u64(u64 dividend, u32 divisor)
  124    {
  125        u32 remainder;
  126        return div_u64_rem(dividend, divisor, &remainder);
  127    }
  128    #endif
  129    
  130    /**  

  Digging into the objdump we can see 0xffffffff819a2d64 here:

  ffffffff819a2d5c:       4c 89 ef                mov    %r13,%rdi
  ffffffff819a2d5f:       e8 0c fe 0e 00          callq  ffffffff81a92b70 
<rb_first>
  >>> ffffffff819a2d64:       48 8b 78 20             mov    0x20(%rax),%rdi
  ffffffff819a2d68:       48 c1 ef 03             shr    $0x3,%rdi
  ffffffff819a2d6c:       48 89 f8                mov    %rdi,%rax
          u64 rto_time_stamp_us = tcp_skb_timestamp_us(skb) + 
jiffies_to_usecs(rto);
  ffffffff819a2d6f:       41 8b bc 24 80 04 00    mov    0x480(%r12),%edi
  ffffffff819a2d76:       00
  ffffffff819a2d77:       48 f7 e3                mul    %rbx
  ffffffff819a2d7a:       48 89 d3                mov    %rdx,%rbx
  ffffffff819a2d7d:       e8 9e 3a 78 ff          callq  ffffffff81126820 
<jiffies_to_usecs>

  which is part of this code:

  /* At how many usecs into the future should the RTO fire? */
  static inline s64 tcp_rto_delta_us(const struct sock *sk)
  {
          const struct sk_buff *skb = tcp_rtx_queue_head(sk);
          u32 rto = inet_csk(sk)->icsk_rto;
  >>>        u64 rto_time_stamp_us = tcp_skb_timestamp_us(skb) + 
jiffies_to_usecs(rto);

  and eventually we see that tcp_skb_timestamp_us() looks like this:
  /* provide the departure time in us unit */
  static inline u64 tcp_skb_timestamp_us(const struct sk_buff *skb)
  {
          return div_u64(skb->skb_mstamp_ns, NSEC_PER_USEC);
  } 

  This points us back to the null ptr deref coming in the call to
  div_u64() as the original gdb disassembly showed. The failing insn is
  "mov    0x20(%rax),%rdi" and skb_mstamp_ns is at offset 0x20 of struct
  sk_buff:

  $ pahole sk_buff /usr/lib/debug/boot/vmlinux-5.4.0-174-generic 
  struct sk_buff {
      union {
          struct {
              struct sk_buff * next;           /*     0     8 */
              struct sk_buff * prev;           /*     8     8 */
              union {
                  struct net_device * dev; /*    16     8 */
                  long unsigned int dev_scratch; /*    16     8 */
              };                               /*    16     8 */
          };                                       /*     0    24 */
          struct rb_node     rbnode;               /*     0    24 */
          struct list_head   list;                 /*     0    16 */
      };                                               /*     0    24 */
      union {
          struct sock *      sk;                   /*    24     8 */
          int                ip_defrag_offset;     /*    24     4 */
      };                                               /*    24     8 */
      union {
          ktime_t            tstamp;               /*    32     8 */
          u64                skb_mstamp_ns;        /*    32     8 */

  With this we now know that the skb was NULL, which triggered the crash.
  This appears to be because this call returned a NULL skb:

  const struct sk_buff *skb = tcp_rtx_queue_head(sk);

  meaning the retransmit queue was empty.
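
  The link between the NULL skb and the faulting address (CR2 = 0x20)
  can be checked with a small user-space sketch. The struct below is a
  hypothetical stand-in that only mirrors the first fields from the
  pahole output, using fixed 8-byte members; it is not the real
  struct sk_buff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mock of the leading sk_buff fields, per the pahole layout. */
struct mock_sk_buff {
	uint64_t rbnode[3];	/* offset  0, 24 bytes (next/prev/dev union) */
	uint64_t sk;		/* offset 24,  8 bytes */
	uint64_t skb_mstamp_ns;	/* offset 32,  8 bytes */
};

/* The load "mov 0x20(%rax),%rdi" reads the field at this offset. */
static_assert(offsetof(struct mock_sk_buff, skb_mstamp_ns) == 0x20,
	      "skb_mstamp_ns is expected at offset 0x20");

/* Address that would be read for skb->skb_mstamp_ns given the skb address. */
static uintptr_t mstamp_load_address(uintptr_t skb_addr)
{
	return skb_addr + offsetof(struct mock_sk_buff, skb_mstamp_ns);
}
```

  mstamp_load_address(0) evaluates to 0x20, which matches RAX = 0 and
  CR2 = 0000000000000020 in the oops.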

  The same issue was hit a number of years ago, as reported here:
  
https://lore.kernel.org/netdev/86c0f836-9a7c-438b-d81a-839be45f1...@gmail.com/T/#t

  with the following commit intended to fix it:

  commit bffd168c3fc5cc7d2bad4c668fa90e7a9010db4b
  Author: Soheil Hassas Yeganeh <soh...@google.com>
  Date:   Sat Apr 14 20:44:46 2018 -0400

  
      tcp: clear tp->packets_out when purging write queue

  However, that commit is already in the kernel version that saw this crash:

  $ git describe --contains bffd168c3fc5cc7d2bad4c668fa90e7a9010db4b
  v5.0~2050^2~28

  We've been able to work around the problem by setting
  tcp_early_retrans and tcp_recovery to 0. That way we won't hit the
  REO_TIMEOUT and LOSS_PROBE states and so will not call
  tcp_rto_delta_us():

          if (!tp->packets_out) {
                  inet_csk_clear_xmit_timer(sk, ICSK_TIME_RETRANS);
          } else {
                  u32 rto = inet_csk(sk)->icsk_rto;
                  /* Offset the time elapsed after installing regular RTO */
                  if (icsk->icsk_pending == ICSK_TIME_REO_TIMEOUT ||
                      icsk->icsk_pending == ICSK_TIME_LOSS_PROBE) {
                          s64 delta_us = tcp_rto_delta_us(sk);
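
  Why the workaround avoids the crash can be sketched as a tiny
  user-space model of the branch quoted above. The enum and function
  names are hypothetical, and the enum values are illustrative only;
  they are not the kernel's ICSK_TIME_* constants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the pending-timer kinds (values illustrative). */
enum mock_pending {
	MOCK_TIME_NONE,
	MOCK_TIME_RETRANS,
	MOCK_TIME_LOSS_PROBE,
	MOCK_TIME_REO_TIMEOUT,
};

/* True when the quoted branch would go on to call tcp_rto_delta_us(). */
static bool rearm_reads_rtx_head(uint32_t packets_out, enum mock_pending pending)
{
	if (!packets_out)
		return false;	/* the retransmit timer is simply cleared */

	return pending == MOCK_TIME_REO_TIMEOUT ||
	       pending == MOCK_TIME_LOSS_PROBE;
}
```

  If neither of those two timers is ever pending, the model never
  reaches the read of the retransmit queue head, even when packets_out
  is wrongly non-zero.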

  As mentioned earlier, there still seems to be a problem with the
  packets_out accounting, which causes the kernel to think there are
  still packets in either the write or retransmit queue. Additionally, I
  feel we should probably add a check in tcp_rto_delta_us() that the
  skb is non-NULL before trying to read the timestamp. Possibly
  something like this:

  Author: Josh Hunt <joh...@akamai.com>
  Date:   Tue Jul 30 19:45:43 2024 -0400

      workaround empty rtx queue

  diff --git a/include/net/tcp.h b/include/net/tcp.h
  index 2aac11e7e1cc..d1e2ecbce536 100644
  --- a/include/net/tcp.h
  +++ b/include/net/tcp.h
  @@ -2434,7 +2434,8 @@ static inline s64 tcp_rto_delta_us(const struct sock 
*sk)
   {
          const struct sk_buff *skb = tcp_rtx_queue_head(sk);
          u32 rto = inet_csk(sk)->icsk_rto;
  -       u64 rto_time_stamp_us = tcp_skb_timestamp_us(skb) + 
jiffies_to_usecs(rto);
  +       u64 rto_time_stamp_us = (skb ? tcp_skb_timestamp_us(skb) : 0) + 
jiffies_to_usecs(rto);
  +       WARN_ON_ONCE(!skb);
   
          return rto_time_stamp_us - tcp_sk(sk)->tcp_mstamp;
   }
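
  The intent of the guard can be tried out in user space with the
  sketch below. Everything in it is a hypothetical model: mock_skb,
  rto_delta_us_guarded() and its parameters are illustrative names, and
  the RTO and current time are passed in directly in microseconds
  rather than read from the socket. Note that the conditional is
  parenthesized, since ?: binds more loosely than +.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_NSEC_PER_USEC 1000ULL

/* Hypothetical stand-in holding only the departure timestamp. */
struct mock_skb {
	uint64_t skb_mstamp_ns;
};

/*
 * Model of a NULL-safe RTO delta: rtx_head may be NULL when the
 * retransmit queue is empty, in which case the timestamp is taken as 0
 * instead of being read through a NULL pointer.
 */
static int64_t rto_delta_us_guarded(const struct mock_skb *rtx_head,
				    uint64_t rto_us, uint64_t now_us)
{
	uint64_t sent_us = rtx_head ?
		rtx_head->skb_mstamp_ns / MOCK_NSEC_PER_USEC : 0;
	uint64_t rto_time_stamp_us = sent_us + rto_us;

	return (int64_t)(rto_time_stamp_us - now_us);
}
```

  For a head skb sent at 5000 us with a 200000 us RTO and a current
  time of 105000 us, the function returns 100000; with a NULL head it
  no longer faults and returns the RTO minus the current time.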

  I looked for newer commits, either in Ubuntu's 5.4 kernels or in
  upstream's net-next branch, that could resolve this problem, but did
  not find anything. We've also tried reproducing this in a lab
  environment, without success so far.

  Please let me know if you need any other information to help root
  cause this issue.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2077657/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
