Based on the comments above, and the Oracle Linux report that Leif M
found (thanks!), it seems likely that this is an upstream Linux kernel
"stable" backport that was incomplete (probably already incomplete in
the upstream Linux kernel "stable" tree), and that a bunch of Linux
distros (including Ubuntu and Oracle Linux) have now imported.  (I
really wish the upstream Linux kernel "stable" process were more
thorough than just randomly cherry-picking patches made to later
kernels and hoping for the best :-/ )

From the Oracle Linux report
(https://github.com/oracle/linux-uek/issues/15 -- found by Leif M above)
it looks like one of the triggering factors is that the Linux kernel is
doing NAT.

That's also consistent with my experience -- the server with problems
hosts several KVM virtual machines, and has a NAT firewall in front of
them.  But I've not seen similar warnings on a laptop or a desktop
running the same 5.4.0-149 kernel, neither of which normally runs any
virtual machines or does NAT itself.
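
For anyone wanting a quick first check of whether a host falls into the
"doing NAT" category, here is a rough sketch.  The function name
list_nat_modules is made up for illustration, and the module list is an
assumption based on the NAT-related names that show up in the "Modules
linked in" line of the trace in the bug description (plus nft_chain_nat
for nftables setups).  A loaded module only suggests NAT is in use; it
does not prove that any NAT rule is actually matching traffic.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original report): reads `lsmod` or
# /proc/modules text on stdin and prints the core NAT-related netfilter
# modules it finds, one per line.  Exit status is 0 if at least one was
# found, 1 otherwise.  Protocol helpers such as nf_nat_ftp are deliberately
# not matched, since they can be loaded without any NAT rules in place.
list_nat_modules() {
    awk '$1 ~ /^(nf_nat|iptable_nat|ip6table_nat|nft_chain_nat|xt_MASQUERADE)$/ {
             print $1
             found = 1
         }
         END { exit (found ? 0 : 1) }'
}
```

Typical use would be something like `lsmod | list_nat_modules`, followed
by a look at the actual firewall rules if anything is printed.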

Some of the patches linked from the Oracle Linux bug report also seem to
suggest that in certain control flows one of the values that's being
relied on to be a pointer actually isn't used, or isn't used as a
pointer (presumably hence 0).  That makes me think "improperly back
ported" (probably by the Linux kernel "stable" process) is the most
likely cause, with a state check getting missed.

Ewen

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-signed in Ubuntu.
https://bugs.launchpad.net/bugs/2018960

Title:
  linux-image-5.4.0-149-generic (regression): 0 at net/core/stream.c:212
  sk_stream_kill_queues+0xcf/0xe0

Status in linux-signed package in Ubuntu:
  Confirmed
Status in linux-signed-kvm package in Ubuntu:
  New

Bug description:
  After upgrading and rebooting this Ubuntu 20.04 LTS server (Ubuntu
  Focal), I noticed that it was suddenly getting a bunch of kernel log
  (dmesg) reports like:

  WARNING: CPU: 4 PID: 0 at net/core/stream.c:212 sk_stream_kill_queues+0xcf/0xe0

  While investigating, I determined that it is currently running the
  focal-proposed kernel (linux-image-5.4.0-149-generic), because
  focal-proposed turns out to have been enabled for this server
  (clearly it seemed like a good idea at the time).

  I'm not expecting focal-proposed to be fixed as if it were a release
  package, but since I couldn't find any reports on Launchpad I figured
  I should let y'all know this focal-proposed package could do with some
  additional work before it's actually released :-)

  There have been at least 80 such reports in the last 5 hours since the
  server was rebooted, differing only by the CPU core and the process
  reported, although it seems the last one was a couple of hours ago, so
  I guess it's traffic dependent/timing dependent.

  ewen@naosr620:~$ uptime
   16:27:32 up  5:19,  1 user,  load average: 0.08, 0.14, 0.06
  ewen@naosr620:~$ dmesg -t | grep WARNING | sed 's/CPU: [0-9]*/CPU: N/; s/PID: [0-9]*/PID: N/;' | uniq -c
       88 WARNING: CPU: N PID: N at net/core/stream.c:212 sk_stream_kill_queues+0xcf/0xe0
  ewen@naosr620:~$ 
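
The one-liner above works here because every warning is identical once
the CPU and PID are masked.  If a log might contain more than one kind
of WARNING, a slightly more general sketch is below.  The function name
count_kernel_warnings is made up for illustration; it assumes
timestamp-free input (as from `dmesg -t`) and adds a `sort` so that
`uniq -c` also groups identical lines that are not adjacent.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original report): reads kernel log
# text on stdin, masks the per-event CPU and PID numbers, and prints
# "<count> <message>" for each distinct WARNING line.
count_kernel_warnings() {
    grep 'WARNING:' |
        sed -E 's/CPU: [0-9]+/CPU: N/; s/PID: [0-9]+/PID: N/' |
        sort |
        uniq -c |
        sed -E 's/^ +//'    # drop the left padding that uniq -c adds
}
```

Typical use would be `dmesg -t | count_kernel_warnings`.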

  Ubuntu Release:

  ewen@naosr620:~$ lsb_release -rd
  Description:  Ubuntu 20.04.6 LTS
  Release:      20.04
  ewen@naosr620:~$ 

  
  Kernel/package version affected:

  ewen@naosr620:~$ uname -a
  Linux naosr620 5.4.0-149-generic #166-Ubuntu SMP Tue Apr 18 16:51:45 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
  ewen@naosr620:~$ dpkg -l | grep linux-image | grep 149
  ii  linux-image-5.4.0-149-generic          5.4.0-149.166                       amd64        Signed kernel image generic
  ii  linux-image-generic                    5.4.0.149.147                       amd64        Generic Linux kernel image
  ewen@naosr620:~$ apt-cache policy linux-image-5.4.0-149-generic 
  linux-image-5.4.0-149-generic:
    Installed: 5.4.0-149.166
    Candidate: 5.4.0-149.166
    Version table:
   *** 5.4.0-149.166 500
          500 https://mirror.fsmg.org.nz/ubuntu focal-proposed/main amd64 Packages
          100 /var/lib/dpkg/status
  ewen@naosr620:~$ apt-cache policy linux-image-generic
  linux-image-generic:
    Installed: 5.4.0.149.147
    Candidate: 5.4.0.149.147
    Version table:
   *** 5.4.0.149.147 500
          500 https://mirror.fsmg.org.nz/ubuntu focal-proposed/main amd64 Packages
          100 /var/lib/dpkg/status
       5.4.0.148.146 500
          500 https://mirror.fsmg.org.nz/ubuntu focal-updates/main amd64 Packages
          500 https://mirror.fsmg.org.nz/ubuntu focal-security/main amd64 Packages
       5.4.0.26.32 500
          500 https://mirror.fsmg.org.nz/ubuntu focal/main amd64 Packages
  ewen@naosr620:~$ 
  ewen@naosr620:~$ apt-cache show linux-image-5.4.0-149-generic | grep Source:
  Source: linux-signed
  ewen@naosr620:~$ 

  
  Full example dmesg, including stack trace (they all seem to be
  WARNINGs, and other than filling dmesg / system logs the system
  "appears to be running okay", so I'm not going to rush another reboot
  now -- near end of business day):

  ewen@naosr620:~$ date
  Tue 09 May 2023 16:34:56 NZST
  ewen@naosr620:~$ dmesg -T | tail -100 | grep -B 150 "end trace" | grep -A 999 "cut here"
  [Tue May  9 14:21:18 2023] ------------[ cut here ]------------
  [Tue May  9 14:21:18 2023] WARNING: CPU: 10 PID: 0 at net/core/stream.c:212 sk_stream_kill_queues+0xcf/0xe0
  [Tue May  9 14:21:18 2023] Modules linked in: mpt3sas raid_class scsi_transport_sas mptctl mptbase vhost_net vhost tap ip6t_REJECT nf_reject_ipv6 ip6table_mangle ip6table_nat ip6table_raw nf_log_ipv6 xt_recent ipt_REJECT nf_reject_ipv4 xt_hashlimit xt_addrtype xt_multiport xt_comment xt_conntrack xt_mark iptable_mangle xt_MASQUERADE iptable_nat xt_CT xt_tcpudp iptable_raw nfnetlink_log xt_NFLOG nf_log_ipv4 nf_log_common xt_LOG nf_nat_tftp nf_nat_snmp_basic nf_conntrack_snmp nf_nat_sip nf_nat_pptp nf_nat_irc ebtable_filter nf_nat_h323 ebtables nf_nat_ftp nf_nat_amanda ts_kmp ip6table_filter nf_conntrack_amanda nf_nat ip6_tables nf_conntrack_sane nf_conntrack_tftp nf_conntrack_sip nf_conntrack_pptp nf_conntrack_netlink nfnetlink nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter bpfilter dell_rbu nls_iso8859_1 ipmi_ssif input_leds joydev cdc_ether usbnet mii cdc_acm intel_rapl_msr intel_rapl_common
  [Tue May  9 14:21:18 2023]  sb_edac x86_pkg_temp_thermal intel_powerclamp binfmt_misc coretemp dcdbas kvm_intel kvm rapl intel_cstate mei_me mei ipmi_si ipmi_devintf mac_hid ipmi_msghandler acpi_power_meter 8021q garp mrp bridge stp llc sch_fq_codel ramoops reed_solomon efi_pstore ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid mgag200 drm_vram_helper ttm crct10dif_pclmul crc32_pclmul drm_kms_helper ghash_clmulni_intel syscopyarea aesni_intel sysfillrect sysimgblt ixgbe fb_sys_fops crypto_simd xfrm_algo cryptd mdio drm glue_helper igb megaraid_sas dca i2c_algo_bit ahci libahci lpc_ich wmi
  [Tue May  9 14:21:18 2023] CPU: 10 PID: 0 Comm: swapper/10 Tainted: G        W         5.4.0-149-generic #166-Ubuntu
  [Tue May  9 14:21:18 2023] Hardware name: Dell Inc. PowerEdge R620/01W23F, BIOS 2.7.0 05/23/2018
  [Tue May  9 14:21:18 2023] RIP: 0010:sk_stream_kill_queues+0xcf/0xe0
  [Tue May  9 14:21:18 2023] Code: c0 75 21 85 f6 75 23 5b 41 5c 5d c3 48 89 df e8 87 0f ff ff 8b 83 48 01 00 00 8b b3 00 01 00 00 85 c0 74 df 0f 0b 85 f6 74 dd <0f> 0b 5b 41 5c 5d c3 0f 0b eb a8 66 0f 1f 44 00 00 0f 1f 44 00 00
  [Tue May  9 14:21:18 2023] RSP: 0018:ffffb4044665c9a8 EFLAGS: 00010206
  [Tue May  9 14:21:18 2023] RAX: 0000000000000000 RBX: ffff8cbef55a5f00 RCX: 000000000002546e
  [Tue May  9 14:21:18 2023] RDX: ffffffffaba54e30 RSI: 0000000000000700 RDI: ffff8cbef55a5f00
  [Tue May  9 14:21:18 2023] RBP: ffffb4044665c9b8 R08: 0000000000000000 R09: 0000000000000000
  [Tue May  9 14:21:18 2023] R10: 0000000000030000 R11: 000000000c000000 R12: ffff8cbef55a5fd0
  [Tue May  9 14:21:18 2023] R13: 0000000000000000 R14: 0000000000000006 R15: 0000000000000079
  [Tue May  9 14:21:18 2023] FS:  0000000000000000(0000) GS:ffff8cbf7f740000(0000) knlGS:0000000000000000
  [Tue May  9 14:21:18 2023] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [Tue May  9 14:21:18 2023] CR2: 00000000b5ea06e8 CR3: 00000012be20a002 CR4: 00000000001626e0
  [Tue May  9 14:21:18 2023] Call Trace:
  [Tue May  9 14:21:18 2023]  <IRQ>
  [Tue May  9 14:21:18 2023]  inet_csk_destroy_sock+0x64/0x150
  [Tue May  9 14:21:18 2023]  tcp_done+0xbc/0x120
  [Tue May  9 14:21:18 2023]  tcp_time_wait+0x1a2/0x2c0
  [Tue May  9 14:21:18 2023]  tcp_fin+0x14e/0x170
  [Tue May  9 14:21:18 2023]  tcp_data_queue+0x437/0x690
  [Tue May  9 14:21:18 2023]  tcp_rcv_state_process+0x267/0x740
  [Tue May  9 14:21:18 2023]  tcp_v6_do_rcv+0x1c5/0x450
  [Tue May  9 14:21:18 2023]  tcp_v6_rcv+0xc2b/0xd10
  [Tue May  9 14:21:18 2023]  ip6_protocol_deliver_rcu+0xd3/0x4e0
  [Tue May  9 14:21:18 2023]  ip6_input_finish+0x15/0x20
  [Tue May  9 14:21:18 2023]  ip6_input+0xa2/0xb0
  [Tue May  9 14:21:18 2023]  ? ip6_protocol_deliver_rcu+0x4e0/0x4e0
  [Tue May  9 14:21:18 2023]  ip6_sublist_rcv_finish+0x3d/0x50
  [Tue May  9 14:21:18 2023]  ip6_sublist_rcv+0x1aa/0x250
  [Tue May  9 14:21:18 2023]  ? ip6_rcv_finish_core.isra.0+0xa0/0xa0
  [Tue May  9 14:21:18 2023]  ipv6_list_rcv+0x112/0x140
  [Tue May  9 14:21:18 2023]  __netif_receive_skb_list_core+0x1a4/0x250
  [Tue May  9 14:21:18 2023]  netif_receive_skb_list_internal+0x1a1/0x2b0
  [Tue May  9 14:21:18 2023]  gro_normal_list.part.0+0x1e/0x40
  [Tue May  9 14:21:18 2023]  napi_complete_done+0x91/0x130
  [Tue May  9 14:21:18 2023]  igb_poll+0x7c/0x350 [igb]
  [Tue May  9 14:21:18 2023]  net_rx_action+0x142/0x390
  [Tue May  9 14:21:18 2023]  __do_softirq+0xd1/0x2c1
  [Tue May  9 14:21:18 2023]  irq_exit+0xae/0xb0
  [Tue May  9 14:21:18 2023]  do_IRQ+0x5a/0xf0
  [Tue May  9 14:21:18 2023]  common_interrupt+0xf/0xf
  [Tue May  9 14:21:18 2023]  </IRQ>
  [Tue May  9 14:21:18 2023] RIP: 0010:cpuidle_enter_state+0xc5/0x450
  [Tue May  9 14:21:18 2023] Code: ff e8 ff da 83 ff 80 7d c7 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 65 03 00 00 31 ff e8 72 f0 89 ff fb 66 0f 1f 44 00 00 <45> 85 ed 0f 88 8f 02 00 00 49 63 cd 4c 8b 7d d0 4c 2b 7d c8 48 8d
  [Tue May  9 14:21:18 2023] RSP: 0018:ffffb4044631be38 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffde
  [Tue May  9 14:21:18 2023] RAX: ffff8cbf7f76fe80 RBX: ffffffffab95a2e0 RCX: 000000000000001f
  [Tue May  9 14:21:18 2023] RDX: 0000000000000000 RSI: 00000000313b2510 RDI: 0000000000000000
  [Tue May  9 14:21:18 2023] RBP: ffffb4044631be78 R08: 00000a8a4770ff97 R09: 0000000000000001
  [Tue May  9 14:21:18 2023] R10: ffff8cbf7f76eb80 R11: ffff8cbf7f76eb60 R12: ffffd3f43f940c00
  [Tue May  9 14:21:18 2023] R13: 0000000000000004 R14: 0000000000000004 R15: ffffd3f43f940c00
  [Tue May  9 14:21:18 2023]  ? cpuidle_enter_state+0xa1/0x450
  [Tue May  9 14:21:18 2023]  cpuidle_enter+0x2e/0x40
  [Tue May  9 14:21:18 2023]  call_cpuidle+0x23/0x40
  [Tue May  9 14:21:18 2023]  do_idle+0x1dd/0x270
  [Tue May  9 14:21:18 2023]  cpu_startup_entry+0x20/0x30
  [Tue May  9 14:21:18 2023]  start_secondary+0x173/0x1d0
  [Tue May  9 14:21:18 2023]  secondary_startup_64+0xa4/0xb0
  [Tue May  9 14:21:18 2023] ---[ end trace 55353b04455fda06 ]---
  ewen@naosr620:~$ 

  (If it gets bad I'll do something about rebooting it onto the latest
  actually released kernel image; if it's fairly quiet I might leave it
  until the next focal-proposed kernel package comes out)

  FTR, this is a fairly production, remote (colo) server, so my
  willingness to try experimental kernel versions to "see if this is the
  problem" is low.  But if there's more information I can gather from
  the running kernel that would help, let me know and I'll try to get it
  for you.

  Also FTR the server hadn't been rebooted for a while, so I'm pretty
  sure it was running linux-image-5.4.0-145-generic before the reboot
  (without these dmesg reports).

  Ewen

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-signed/+bug/2018960/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
