Public bug reported:

Hey!

After upgrading a few VPN servers to 4.15.0-38.41 (on either Xenial or
Bionic), we get random crashes. This also happens with 4.18 from
bionic-proposed. These crashes didn't happen with 4.4 from Xenial. Here
is a stack trace:

[   31.154360] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
[   31.162233] PGD 0 P4D 0
[   31.164786] Oops: 0000 [#1] SMP PTI
[   31.168291] CPU: 5 PID: 42 Comm: ksoftirqd/5 Not tainted 4.18.0-11-generic #12~18.04.1-Ubuntu
[   31.176854] Hardware name: Supermicro Super Server/X10SDV-4C-7TP4F, BIOS 1.0b 11/21/2016
[   31.184980] RIP: 0010:vti_rcv_cb+0xb9/0x1a0 [ip_vti]
[   31.189962] Code: 8b 44 24 70 0f c8 89 87 b4 00 00 00 48 8b 86 20 05 00 00 8b 80 f8 14 00 00 85 c0 75 05 48 85 d2 74 0e 48 8b 43 58 48 83 e0 fe <f6> 40 38 04 74 7d 44 89 b3 b4 00 00 00 49 8b 44 24 20 48 39 86 20
[   31.208916] RSP: 0018:ffffbc61832e3920 EFLAGS: 00010246
[   31.214160] RAX: 0000000000000000 RBX: ffff9a3504964a00 RCX: 0000000000000002
[   31.221328] RDX: ffff9a351add4080 RSI: ffff9a351aa08000 RDI: ffff9a3504964a00
[   31.228485] RBP: ffffbc61832e3940 R08: 0000000000000004 R09: ffffffffc0aa612b
[   31.235643] R10: 0008f09b99881884 R11: 1884bd4e2d6b1fac R12: ffff9a3507b31900
[   31.242803] R13: ffff9a3507b31000 R14: 0000000000000000 R15: ffff9a3504964a00
[   31.249964] FS:  0000000000000000(0000) GS:ffff9a35bfd40000(0000) knlGS:0000000000000000
[   31.258077] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   31.263848] CR2: 0000000000000038 CR3: 000000041a40a003 CR4: 00000000003606e0
[   31.271004] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   31.278163] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   31.285320] Call Trace:
[   31.287789]  xfrm4_rcv_cb+0x4a/0x70
[   31.291297]  xfrm_input+0x58f/0x8f0
[   31.294807]  vti_input+0xaa/0x110 [ip_vti]
[   31.298926]  vti_rcv+0x33/0x3c [ip_vti]
[   31.302783]  xfrm4_esp_rcv+0x39/0x50
[   31.306375]  ip_local_deliver_finish+0x62/0x200
[   31.310923]  ip_local_deliver+0xdf/0xf0
[   31.314775]  ? ip_rcv_finish+0x420/0x420
[   31.318718]  ip_rcv_finish+0x126/0x420
[   31.322486]  ip_rcv+0x28f/0x360
[   31.325655]  ? inet_del_offload+0x40/0x40
[   31.329686]  __netif_receive_skb_core+0x48c/0xb70
[   31.334413]  ? kmem_cache_alloc+0xb4/0x1d0
[   31.338532]  ? __build_skb+0x2b/0xf0
[   31.342128]  __netif_receive_skb+0x18/0x60
[   31.346244]  ? __netif_receive_skb+0x18/0x60
[   31.350536]  netif_receive_skb_internal+0x45/0xe0
[   31.355263]  napi_gro_receive+0xc5/0xf0
[   31.359141]  mlx5e_handle_rx_cqe+0x1b2/0x5d0 [mlx5_core]
[   31.364476]  ? skb_release_all+0x24/0x30
[   31.368430]  mlx5e_poll_rx_cq+0xd3/0x990 [mlx5_core]
[   31.373432]  mlx5e_napi_poll+0x9b/0xc60 [mlx5_core]
[   31.378333]  ? __switch_to_asm+0x34/0x70
[   31.382270]  ? __switch_to_asm+0x40/0x70
[   31.386214]  ? __switch_to_asm+0x34/0x70
[   31.391056]  ? __switch_to_asm+0x40/0x70
[   31.395905]  ? __switch_to_asm+0x34/0x70
[   31.400743]  net_rx_action+0x140/0x3a0
[   31.405379]  ? __switch_to+0xad/0x500
[   31.409887]  __do_softirq+0xe4/0x2bb
[   31.414448]  run_ksoftirqd+0x2b/0x40
[   31.418862]  smpboot_thread_fn+0xfc/0x170
[   31.423700]  kthread+0x121/0x140
[   31.427701]  ? sort_range+0x30/0x30
[   31.432040]  ? kthread_create_worker_on_cpu+0x70/0x70
[   31.437816]  ret_from_fork+0x35/0x40
[   31.442219] Modules linked in: esp6 authenc echainiv xfrm6_mode_tunnel xfrm4_mode_tunnel xfrm_user xfrm4_tunnel tunnel4 ipcomp xfrm_ipcomp esp4 ah4 af_key xfrm_algo ip_vti ip_tunnel ip6_vti ip6_tunnel tunnel6 8021q garp mrp stp llc bonding ipt_REJECT nf_reject_ipv4 nfnetlink_log nfnetlink xt_NFLOG xt_hl xt_limit xt_nat xt_TCPMSS xt_HL xt_comment xt_tcpudp xt_multiport xt_conntrack iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat xt_connmark xt_mark iptable_mangle xt_CT nf_conntrack xt_addrtype iptable_raw bpfilter ipmi_ssif gpio_ich intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass intel_cstate intel_rapl_perf input_leds joydev mei_me intel_pch_thermal ioatdma mei lpc_ich ipmi_si ipmi_devintf ipmi_msghandler acpi_pad mac_hid sch_fq_codel
[   31.519488]  ib_iser rdma_cm iw_cm ib_cm iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear mlx5_ib ib_uverbs ib_core raid1 hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ast pcbc ttm drm_kms_helper aesni_intel syscopyarea aes_x86_64 sysfillrect mxm_wmi crypto_simd sysimgblt cryptd glue_helper fb_sys_fops mlx5_core ixgbe igb mpt3sas drm ahci tls libahci i2c_algo_bit mlxfw raid_class dca devlink mdio scsi_transport_sas wmi
[   31.578877] CR2: 0000000000000038
[ 31.583249] ---[ end trace c4bada38847a0075 ]---
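
Since the crash is rare, it may help to compare oopses from several machines mechanically. The following is only a rough sketch, not an existing tool: `summarize_oops` and the `OOPS` excerpt are hypothetical names, and the regular expressions assume the dmesg layout seen in the trace above (with wrapped lines rejoined). It pulls out the fault address, the faulting symbol and module, and the definite (non-`?`) call trace frames.

```python
import re

# Excerpt of the oops above, with wrapped lines rejoined.
OOPS = """\
[   31.154360] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
[   31.184980] RIP: 0010:vti_rcv_cb+0xb9/0x1a0 [ip_vti]
[   31.263848] CR2: 0000000000000038 CR3: 000000041a40a003 CR4: 00000000003606e0
[   31.285320] Call Trace:
[   31.287789]  xfrm4_rcv_cb+0x4a/0x70
[   31.291297]  xfrm_input+0x58f/0x8f0
[   31.294807]  vti_input+0xaa/0x110 [ip_vti]
[   31.298926]  vti_rcv+0x33/0x3c [ip_vti]
[   31.302783]  xfrm4_esp_rcv+0x39/0x50
[   31.306375]  ip_local_deliver_finish+0x62/0x200
[   31.310923]  ip_local_deliver+0xdf/0xf0
[   31.314775]  ? ip_rcv_finish+0x420/0x420
"""


def summarize_oops(text):
    """Extract the key facts from an x86-64 NULL-dereference oops (hypothetical helper)."""
    fault = re.search(r"NULL pointer dereference at\s+([0-9a-f]{16})", text)
    rip = re.search(
        r"RIP: [0-9a-f]{4}:(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)(?: \[(\w+)\])?", text
    )
    cr2 = re.search(r"CR2: ([0-9a-f]{16})", text)
    # Frames printed without a leading "?" are the ones the unwinder is sure of.
    frames = re.findall(r"^\[\s*[\d.]+\]\s{2}(\w+)\+0x", text, re.M)
    return {
        "fault_addr": int(fault.group(1), 16),
        "cr2": int(cr2.group(1), 16),
        "symbol": rip.group(1),
        "offset": int(rip.group(2), 16),
        "module": rip.group(4),
        "frames": frames,
    }


summary = summarize_oops(OOPS)
print(summary["symbol"], hex(summary["offset"]), summary["module"])
print(" <- ".join(summary["frames"]))
```

Two oopses with the same symbol, offset and fault address on the same kernel build are very likely the same bug, which is easier to confirm this way than by eye.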

Upgrading to mainline 4.18.17 seems to solve the issue. It's difficult
to bisect as it doesn't happen often. 4.18.17 contains
c473a489d4098969ffafda913e1ad71da31b1104 (xfrm: Fix NULL pointer
dereference when skb_dst_force clears the dst_entry), but it doesn't
match the stack trace (the stack trace is on the input path, the patch
covers output and forward). There is also
fdb06c787b34fd397f28f515105627307d615025 (xfrm: Fix NULL pointer
dereference when skb_dst_force clears the dst_entry), which is also in
4.17 and may better match the problem, but I am unsure what it means to
have several transformations (we use VTI interfaces, but other than
that, we don't do anything fancy).
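
To help judge which patch matches, the faulting instruction can be read directly from the "Code:" line. The sketch below is a hand-rolled decoder for just the one x86-64 encoding involved (F6 /0 ib with a RAX base and an 8-bit displacement), not a general disassembler; `faulting_bytes` and `decode_test_rm8_imm8` are hypothetical helper names. It shows that the instruction is `test byte ptr [rax+0x38], 0x4` and that, with RAX = 0, the effective address equals CR2.

```python
# Bytes from the "Code:" line of the oops; <f6> marks the faulting instruction.
CODE = (
    "8b 44 24 70 0f c8 89 87 b4 00 00 00 48 8b 86 20 05 00 00 "
    "8b 80 f8 14 00 00 85 c0 75 05 48 85 d2 74 0e 48 8b 43 58 48 83 e0 fe "
    "<f6> 40 38 04 74 7d 44 89 b3 b4 00 00 00 49 8b 44 24 20 48 39 86 20"
)
RAX = 0x0000000000000000  # from the register dump
CR2 = 0x0000000000000038  # faulting address reported by the CPU


def faulting_bytes(code, count=4):
    """Return `count` bytes starting at the <..> marker (hypothetical helper)."""
    tokens = code.split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return [int(tok.strip("<>"), 16) for tok in tokens[start:start + count]]


def decode_test_rm8_imm8(insn):
    """Decode only 'test byte ptr [rax+disp8], imm8' (opcode F6 /0 ib)."""
    opcode, modrm, disp8, imm8 = insn
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    # mod=01 means an 8-bit displacement follows; rm=000 selects RAX as base.
    if (opcode, mod, reg, rm) != (0xF6, 0b01, 0, 0):
        raise ValueError("not 'test byte ptr [rax+disp8], imm8'")
    if disp8 >= 0x80:  # disp8 is signed
        disp8 -= 0x100
    return disp8, imm8


disp, mask = decode_test_rm8_imm8(faulting_bytes(CODE))
print(f"test byte ptr [rax+{disp:#x}], {mask:#x}")
print("effective address:", hex(RAX + disp), "CR2:", hex(CR2))
```

The two instructions just before it (`48 8b 43 58`, `48 83 e0 fe`) load a pointer from `[rbx+0x58]` and clear its low bit. My reading, which is an assumption and not checked against this exact build, is that this is `skb_dst(skb)` returning NULL followed by a test of a flag in the dst entry (the 0x4 mask would fit `DST_NOPOLICY`). That would point at a missing dst on the input path rather than on output or forward.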

Hardware is Mellanox ConnectX-4 Lx (no ESP offload).

May I suggest upgrading 4.18 to 4.18.17 and backporting these two
patches to Bionic 4.15?

Thanks.

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: Incomplete


** Tags: cosmic

https://bugs.launchpad.net/bugs/1802480

Title:
  Crash when using IPsec VTI interfaces on 4.15 and 4.18.
