[Kernel-packages] [Bug 2061091] Re: Freezing user space processes failed after 20.008 seconds (1 tasks refusing to freeze, wq_busy=0)
Thanks for the update, could you bisect which commit fixes the issue since upstream 6.8 works?
https://git-scm.com/docs/git-bisect

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-signed-hwe-6.5 in Ubuntu.
https://bugs.launchpad.net/bugs/2061091

Title:
  Freezing user space processes failed after 20.008 seconds (1 tasks refusing to freeze, wq_busy=0)

Status in linux-signed-hwe-6.5 package in Ubuntu:
  New

Bug description:
  Sometimes, when trying to shut down or suspend my laptop, it gets stuck on the console screen. If I was suspending, it eventually gives up and goes back to the X session. During a shutdown it hangs forever and the only solution seems to be to force a reboot with Magic-SysRq.

  The following appears in `kern.log`:

  ```
  Apr 12 08:59:54 xps9320 kernel: [173172.510341] Freezing user space processes failed after 20.008 seconds (1 tasks refusing to freeze, wq_busy=0):
  Apr 12 08:59:54 xps9320 kernel: [173172.515669] task:wireplumber state:D stack:0 pid:2408 ppid:2398 flags:0x0006
  Apr 12 08:59:54 xps9320 kernel: [173172.518923] Call Trace:
  Apr 12 08:59:54 xps9320 kernel: [173172.521755]
  Apr 12 08:59:54 xps9320 kernel: [173172.524099] __schedule+0x2cb/0x750
  Apr 12 08:59:54 xps9320 kernel: [173172.526333] schedule+0x63/0x110
  Apr 12 08:59:54 xps9320 kernel: [173172.528585] snd_power_ref_and_wait+0xe5/0x140 [snd]
  Apr 12 08:59:54 xps9320 kernel: [173172.530825] ? __pfx_autoremove_wake_function+0x10/0x10
  Apr 12 08:59:54 xps9320 kernel: [173172.533103] snd_ctl_elem_info+0x4f/0x1b0 [snd]
  Apr 12 08:59:54 xps9320 kernel: [173172.535354] snd_ctl_elem_info_user+0x59/0xc0 [snd]
  Apr 12 08:59:54 xps9320 kernel: [173172.537598] snd_ctl_ioctl+0x1d4/0x650 [snd]
  Apr 12 08:59:54 xps9320 kernel: [173172.539846] ? __fget_light+0xa5/0x120
  Apr 12 08:59:54 xps9320 kernel: [173172.542082] __x64_sys_ioctl+0xa0/0xf0
  Apr 12 08:59:54 xps9320 kernel: [173172.544334] do_syscall_64+0x58/0x90
  Apr 12 08:59:54 xps9320 kernel: [173172.546566] ? syscall_exit_to_user_mode+0x37/0x60
  Apr 12 08:59:54 xps9320 kernel: [173172.548838] ? do_syscall_64+0x67/0x90
  Apr 12 08:59:54 xps9320 kernel: [173172.551085] ? do_syscall_64+0x67/0x90
  Apr 12 08:59:54 xps9320 kernel: [173172.553342] ? syscall_exit_to_user_mode+0x37/0x60
  Apr 12 08:59:54 xps9320 kernel: [173172.96] ? do_syscall_64+0x67/0x90
  Apr 12 08:59:54 xps9320 kernel: [173172.557828] ? common_interrupt+0x54/0xb0
  Apr 12 08:59:54 xps9320 kernel: [173172.560075] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
  Apr 12 08:59:54 xps9320 kernel: [173172.562321] RIP: 0033:0x70f3adb1a94f
  Apr 12 08:59:54 xps9320 kernel: [173172.564650] RSP: 002b:7ffef2072940 EFLAGS: 0246 ORIG_RAX: 0010
  Apr 12 08:59:54 xps9320 kernel: [173172.566925] RAX: ffda RBX: 7ffef20729b0 RCX: 70f3adb1a94f
  Apr 12 08:59:54 xps9320 kernel: [173172.569222] RDX: 7ffef20729b0 RSI: c1105511 RDI: 0022
  Apr 12 08:59:54 xps9320 kernel: [173172.571501] RBP: 7ffef2072b90 R08: 65107d2b6070 R09: 0004
  Apr 12 08:59:54 xps9320 kernel: [173172.573762] R10: f014 R11: 0246 R12: 65107d2ee100
  Apr 12 08:59:54 xps9320 kernel: [173172.576047] R13: 65107d1fb400 R14: 7ffef2072b30 R15: 7ffef2072ad0
  Apr 12 08:59:54 xps9320 kernel: [173172.578308]
  ```

  Another point of interest is that, when in this situation:

  * `lsusb` hangs after printing out a few lines
  * Outgoing SSH connections hang unless I clear SSH_AUTH_SOCK. gpg-agent appears to be trying to check for smartcards.

  So I guess there is some sort of deadlock in the USB subsystem which is causing all the other problems?

  ProblemType: Bug
  DistroRelease: Ubuntu 22.04
  Package: linux-image-6.5.0-27-generic 6.5.0-27.28~22.04.1
  ProcVersionSignature: Ubuntu 6.5.0-27.28~22.04.1-generic 6.5.13
  Uname: Linux 6.5.0-27-generic x86_64
  ApportVersion: 2.20.11-0ubuntu82.5
  Architecture: amd64
  CasperMD5CheckResult: pass
  CurrentDesktop: ubuntu:GNOME
  Date: Fri Apr 12 09:42:52 2024
  InstallationDate: Installed on 2022-06-28 (653 days ago)
  InstallationMedia: Xubuntu 22.04 LTS "Jammy Jellyfish" - Release amd64 (20220419)
  SourcePackage: linux-signed-hwe-6.5
  UpgradeStatus: No upgrade log present (probably fresh install)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-signed-hwe-6.5/+bug/2061091/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
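
For anyone triaging a similar "refusing to freeze" report, the step of reading the blocked task and its module frames out of the `kern.log` trace can be sketched in Python. This is only a minimal sketch: the function name `blocked_tasks` and both regular expressions are illustrative assumptions modelled on the excerpt in this report, not part of any kernel or Ubuntu tooling, and other kernel versions may print the trace differently.

```python
import re

# Assumed patterns, modelled on the kern.log excerpt in this bug report.
TASK_RE = re.compile(
    r"task:(?P<comm>\S+)\s+state:(?P<state>\S+).*?\bpid:(?P<pid>\d+)"
)
FRAME_RE = re.compile(
    r"(?P<maybe>\? )?(?P<sym>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<mod>\w+)\])?"
)


def blocked_tasks(log_text):
    """Return one dict per 'task:' header, with its reliable module frames."""
    tasks = []
    current = None
    for line in log_text.splitlines():
        header = TASK_RE.search(line)
        if header:
            current = {
                "comm": header["comm"],
                "state": header["state"],
                "pid": int(header["pid"]),
                "module_frames": [],
            }
            tasks.append(current)
            continue
        if current is None:
            continue
        frame = FRAME_RE.search(line)
        # Frames printed with a leading "? " are unreliable stack leftovers,
        # so only unprefixed frames that name a module are kept.
        if frame and not frame["maybe"] and frame["mod"]:
            current["module_frames"].append((frame["sym"], frame["mod"]))
    return tasks
```

Run on the trace in this report, the first module frame for `wireplumber` is `snd_power_ref_and_wait` in the `snd` module, which points at the sound control ioctl path as the place where the task is blocked.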
[Kernel-packages] [Bug 2042363] Re: AIX 7.3 NFS client frequently returns an EIO error to an application when reading or writing to a file that has been locked with fcntl() on a Ubuntu 20.04 NFSV4 ser
Per comment #23, the IPs of the AIX 7.2 clients are:

  9.20.120.127  name = adia.v6.hursley.ibm.com       (Primary)
  9.20.121.46   name = amberjack.v6.hursley.ibm.com  (Partner)

I searched the trace again for these IPs. It looks like socket cc6f0db2 is the connection between 9.20.120.127 and the NFS server, yet recvfrom on it can also return EAGAIN (-11):

duckseason kernel: [13254.724411] svc: socket cc6f0db2 sendto([8485f39d 72... ], 72) = 72 (addr 9.20.120.127, port=1022)
...
duckseason kernel: [13254.724734] svc: socket cc6f0db2(inet c831762e), busy=0
duckseason kernel: [13254.724759] svc: server 728e82a2, pool 0, transport cc6f0db2, inuse=2
duckseason kernel: [13254.724761] svc: tcp_recv cc6f0db2 data 1 conn 0 close 0
duckseason kernel: [13254.724765] svc: socket cc6f0db2 recvfrom(b6708704, 4) = 4
duckseason kernel: [13254.724766] svc: TCP record, 168 bytes
duckseason kernel: [13254.724769] svc: socket cc6f0db2 recvfrom(57dbced3, 4096) = 168
duckseason kernel: [13254.724771] svc: TCP final record (168 bytes)
duckseason kernel: [13254.724775] svc: svc_authenticate (1)
duckseason kernel: [13254.724779] svc: server ee62a401, pool 0, transport cc6f0db2, inuse=3
duckseason kernel: [13254.724780] svc: tcp_recv cc6f0db2 data 1 conn 0 close 0
duckseason kernel: [13254.724783] svc: socket cc6f0db2 recvfrom(b6708704, 4) = -11

The same holds for socket 3497acd5, which is used between 9.20.121.46 and the NFS server:

duckseason kernel: [13254.802249] svc: socket 3497acd5 sendto([86e5a045 72... ], 72) = 72 (addr 9.20.121.46, port=1020)
...
duckseason kernel: [13254.802533] svc: socket 3497acd5(inet 72c9551d), busy=0
duckseason kernel: [13254.802571] svc: server 728e82a2, pool 0, transport 3497acd5, inuse=2
duckseason kernel: [13254.802573] svc: tcp_recv 3497acd5 data 1 conn 0 close 0
duckseason kernel: [13254.802578] svc: socket 3497acd5 recvfrom(77f9cf7c, 4) = 4
duckseason kernel: [13254.802579] svc: TCP record, 164 bytes
duckseason kernel: [13254.802583] svc: socket 3497acd5 recvfrom(57dbced3, 4096) = 164
duckseason kernel: [13254.802585] svc: TCP final record (164 bytes)
duckseason kernel: [13254.802590] svc: svc_authenticate (1)
duckseason kernel: [13254.802596] svc: server ee62a401, pool 0, transport 3497acd5, inuse=3
duckseason kernel: [13254.802597] svc: tcp_recv 3497acd5 data 1 conn 0 close 0
duckseason kernel: [13254.802599] svc: socket 3497acd5 recvfrom(77f9cf7c, 4) = -11

However, according to the bug description the AIX 7.2 client works with the same server, so I am curious why the 7.2 client also gets EAGAIN, the same as the 7.3 client. What am I missing?

Some questions/suggestions:
1. Did the AIX 7.3 NFS client work with a previous kernel? If so, run "git bisect" to find which commit caused the issue.
2. Is it possible to try the latest 5.4 stable kernel as suggested in comment #1? Please also try the latest upstream kernel (6.9-rc5 at this time).
3. Does increasing the lease time make a difference?

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2042363

Title:
  AIX 7.3 NFS client frequently returns an EIO error to an application when reading or writing to a file that has been locked with fcntl() on a Ubuntu 20.04 NFSV4 server

Status in linux package in Ubuntu:
  New

Bug description:
  ---Problem Description---
  AIX 7.3 NFS client frequently returns an EIO error to an application when reading or writing to a file that has been locked with fcntl(). NFS server is Ubuntu 20.04.6 LTS, GNU/Linux 5.4.0-139-generic x86_64. The problem does not appear to affect other combinations of NFS client (including AIX 7.2) with this NFS server.

  The AIX team have indicated that the cause of the EIO is triggered by the NFS server returning a BAD_SEQID error which leads to the AIX NFS client incorrectly zeroing the stateid, which then leads to the NFS server returning a BAD_STATEID error and the NFS client then returns the EIO error. The AIX team would like to understand why the BAD_SEQID has been returned.

  ---uname output---
  Linux duckseason 5.4.0-156-generic #173-Ubuntu SMP Tue Jul 11 07:25:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

  Machine Type = VMware ESXi Server 7.0 4 x Intel(R) Xeon(R) Gold 6348H CPU @ 2.30GHz

  ---Steps to Reproduce---
  We cannot offer a simple way to recreate the problem as it involves IBM MQ running on two primary machines (AIX) using the Ubuntu server for its HA NFSv4 storage. However, we can provide any requested trace or dumps from any or all of the involved machines.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2042363/+subscriptions

-- 
Mailing list:
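
The socket-to-client matching done by hand in the comment above can be sketched in Python, which may help when comparing the 7.2 and 7.3 traces. This is a rough sketch under stated assumptions: the function name `eagain_by_client` and the two regular expressions are hypothetical and only follow the `svc:` debug lines quoted in this thread, and a socket is attributed to a client only after a `sendto` line carrying `addr` has been seen for it.

```python
import re
from collections import Counter

# Assumed line shapes, taken from the svc debug output quoted in this thread.
SENDTO_RE = re.compile(
    r"svc: socket (?P<sock>[0-9a-f]+) sendto\(.*\) = \d+ "
    r"\(addr (?P<addr>[\d.]+), port=\d+\)"
)
RECV_RE = re.compile(
    r"svc: socket (?P<sock>[0-9a-f]+) recvfrom\([^)]*\) = (?P<ret>-?\d+)"
)
EAGAIN = 11  # Linux errno value; the trace prints it negated, as -11


def eagain_by_client(lines):
    """Count recvfrom() = -11 results per client address."""
    sock_to_addr = {}
    counts = Counter()
    for line in lines:
        sent = SENDTO_RE.search(line)
        if sent:
            # Remember which client address this socket id talks to.
            sock_to_addr[sent["sock"]] = sent["addr"]
            continue
        received = RECV_RE.search(line)
        if received and int(received["ret"]) == -EAGAIN:
            counts[sock_to_addr.get(received["sock"], "unknown")] += 1
    return dict(counts)
```

Note that -11 from a non-blocking `recvfrom` normally just means no further data was queued at that moment, so a per-client count like this is a starting point for comparing the two clients rather than evidence of a fault on its own.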
[Kernel-packages] [Bug 2051232] Re: kernel: BUG: Bad page state in process kworker
Could you try the latest upstream kernel, which converts dm-crypt's tasklet to a BH workqueue? I suppose commit fb6ad4aec1d0 ("dm-crypt: Convert from tasklet to BH workqueue") might resolve the issue.

Also, mantic master-next has disabled tasklets for dm-crypt:
https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/mantic/commit/?h=master-next=13104eddc76990dc3e4183cff050c9b6dc5e859e

I suppose hwe-6.5 will sync from mantic later, so please try the newer kernel.

BTW, could you share how to reproduce the issue? I can try from my side in case the above commits don't help.

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-hwe-6.5 in Ubuntu.
https://bugs.launchpad.net/bugs/2051232

Title:
  kernel: BUG: Bad page state in process kworker

Status in linux-hwe-6.5 package in Ubuntu:
  Confirmed

Bug description:
  Similar to the bug https://bugs.launchpad.net/ubuntu/+source/linux-hwe-6.5/+bug/2051123 where traces were shown, we observed a "BUG" being reported on yet another machine of the same make / model (Asus RS720A-E11-RS24U using dual socket AMD EPYC Milan CPUs):

  ```
  [...]
Jan 24 08:57:00 fra-az1-comp-24 kernel: BUG: Bad page state in process kworker/u257:18 pfn:5812dc Jan 24 08:57:00 fra-az1-comp-24 kernel: page:b0c63dd1 refcount:-1 mapcount:0 mapping: index:0x0 pfn:0x5812dc Jan 24 08:57:00 fra-az1-comp-24 kernel: flags: 0x17c000(node=0|zone=2|lastcpupid=0x1f) Jan 24 08:57:00 fra-az1-comp-24 kernel: page_type: 0x() Jan 24 08:57:00 fra-az1-comp-24 kernel: raw: 0017c000 dead0100 dead0122 Jan 24 08:57:00 fra-az1-comp-24 kernel: raw: Jan 24 08:57:00 fra-az1-comp-24 kernel: page dumped because: nonzero _refcount Jan 24 08:57:00 fra-az1-comp-24 kernel: Modules linked in: vxlan ip6_udp_tunnel udp_tunnel ebt_arp nft_meta_bridge xt_CT xt_mac xt_state xt_comment xt_physdev vhost_net vhost vhost_iotlb tap xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink xfrm_user xfrm_algo nvme_fabrics 8021q garp mrp br_netfilter bridge stp llc bonding binfmt_misc tls nls_ascii ipmi_ssif intel_rapl_msr intel_rapl_ common amd64_edac edac_mce_amd kvm_amd kvm irqbypass rapl wmi_bmof irdma ib_uverbs ib_core joydev input_leds ccp k10temp ptdma switchtec acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler mac_hid sch_fq_codel efi_pstore ip_tables x_tables autofs4 dm_crypt raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear hid_generic usbhid hid cdc_ether usbnet mii ast i2c_algo_bit drm_shmem_helper raid1 drm_kms_helper ice crct10dif_pclmul crc32_pclmul po lyval_clmulni polyval_generic Jan 24 08:57:00 fra-az1-comp-24 kernel: ghash_clmulni_intel aesni_intel crypto_simd cryptd ahci nvme gnss drm i40e libahci xhci_pci i2c_piix4 xhci_pci_renesas nvme_core nvme_common wmi Jan 24 08:57:00 fra-az1-comp-24 kernel: CPU: 14 PID: 1094271 Comm: kworker/u257:18 Not tainted 6.5.0-14-generic #14~22.04.1-Ubuntu Jan 24 08:57:00 fra-az1-comp-24 kernel: Hardware name: To be filled by O.E.M. 
To be filled by O.E.M./KMPP-D32 Series, BIOS 1501 08/23/2023 Jan 24 08:57:00 fra-az1-comp-24 kernel: Workqueue: kcryptd/252:12 kcryptd_crypt [dm_crypt] Jan 24 08:57:00 fra-az1-comp-24 kernel: Call Trace: Jan 24 08:57:00 fra-az1-comp-24 kernel: Jan 24 08:57:00 fra-az1-comp-24 kernel: dump_stack_lvl+0x48/0x70 Jan 24 08:57:00 fra-az1-comp-24 kernel: dump_stack+0x10/0x20 Jan 24 08:57:00 fra-az1-comp-24 kernel: bad_page+0x76/0x120 Jan 24 08:57:00 fra-az1-comp-24 kernel: __rmqueue_pcplist+0x149/0x1d0 Jan 24 08:57:00 fra-az1-comp-24 kernel: ? srso_alias_return_thunk+0x5/0x7f Jan 24 08:57:00 fra-az1-comp-24 kernel: rmqueue+0x37c/0xf10 Jan 24 08:57:00 fra-az1-comp-24 kernel: get_page_from_freelist+0x10b/0x4c0 Jan 24 08:57:00 fra-az1-comp-24 kernel: ? srso_alias_return_thunk+0x5/0x7f Jan 24 08:57:00 fra-az1-comp-24 kernel: __alloc_pages+0x1e7/0x350 Jan 24 08:57:00 fra-az1-comp-24 kernel: alloc_pages+0x90/0x1a0 Jan 24 08:57:00 fra-az1-comp-24 kernel: crypt_page_alloc+0x2f/0x70 [dm_crypt] Jan 24 08:57:00 fra-az1-comp-24 kernel: mempool_alloc+0x83/0x1c0 Jan 24 08:57:00 fra-az1-comp-24 kernel: ? srso_alias_return_thunk+0x5/0x7f Jan 24 08:57:00 fra-az1-comp-24 kernel: crypt_alloc_buffer+0x11a/0x1f0 [dm_crypt] Jan 24 08:57:00 fra-az1-comp-24 kernel: kcryptd_crypt_write_convert+0xa3/0x1d0 [dm_crypt] Jan 24 08:57:00 fra-az1-comp-24 kernel: kcryptd_crypt+0x114/0x170 [dm_crypt] Jan 24 08:57:00 fra-az1-comp-24 kernel: process_one_work+0x240/0x450 Jan 24 08:57:00 fra-az1-comp-24 kernel: worker_thread+0x50/0x3f0 Jan 24 08:57:00 fra-az1-comp-24 kernel: ?
[Kernel-packages] [Bug 2051123] Re: Kernel traces leading to crash - refcount_t: underflow; use-after-free and refcount_t: saturated; leaking memory -- lib/refcount.c
Err, the comments (#5 and #6) are for lp#2051232, sorry for confusion!

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-hwe-6.5 in Ubuntu.
https://bugs.launchpad.net/bugs/2051123

Title:
  Kernel traces leading to crash - refcount_t: underflow; use-after-free and refcount_t: saturated; leaking memory -- lib/refcount.c

Status in linux-hwe-6.5 package in Ubuntu:
  Confirmed

Bug description:
  A few hours after upgrading a machine serving as VM hypervisor running OpenStack Nova + libvirt from linux kernel 6.2.0-37-generic to 6.5.0-14-generic we observed kernel traces and quick disintegration of the system and its various processes. While the TCP connection itself was accepted, we were unable to log in via SSH anymore or use the console. A hard reset was required to get the machine back up.

  We went back to the former HWE kernel version, 6.2.0-37-generic, and have not observed any issues since.

  Attached is all of the kernel log from bootup to the crash - this is where the issues started ...

  ```
  [...]
  Jan 23 11:36:13 fra-az1-comp-21 kernel: vxlan-304: fa:16:3e:7f:e2:6f migrated from 10.101.11.98 to 10.101.11.101
  Jan 23 11:41:05 fra-az1-comp-21 kernel: hrtimer: interrupt took 32482 ns
  Jan 23 12:02:41 fra-az1-comp-21 kernel: clocksource: timekeeping watchdog on CPU50: hpet wd-wd read-back delay of 245561ns
  Jan 23 12:02:41 fra-az1-comp-21 kernel: clocksource: wd-tsc-wd read-back delay of 245561ns, clock-skew test skipped!
  Jan 23 12:18:18 fra-az1-comp-21 kernel: perf: interrupt took too long (2509 > 2500), lowering kernel.perf_event_max_sample_rate to 79500
  Jan 23 12:44:35 fra-az1-comp-21 kernel: clocksource: timekeeping watchdog on CPU127: hpet wd-wd read-back delay of 244863ns
  Jan 23 12:44:35 fra-az1-comp-21 kernel: clocksource: wd-tsc-wd read-back delay of 245352ns, clock-skew test skipped!
  Jan 23 13:08:56 fra-az1-comp-21 kernel: clocksource: timekeeping watchdog on CPU16: hpet wd-wd read-back delay of 243257ns
  Jan 23 13:08:56 fra-az1-comp-21 kernel: clocksource: wd-tsc-wd read-back delay of 247517ns, clock-skew test skipped!
  Jan 23 14:13:58 fra-az1-comp-21 kernel: clocksource: timekeeping watchdog on CPU52: hpet wd-wd read-back delay of 248076ns
  Jan 23 14:13:58 fra-az1-comp-21 kernel: clocksource: wd-tsc-wd read-back delay of 245142ns, clock-skew test skipped!
  Jan 23 14:31:18 fra-az1-comp-21 kernel: clocksource: timekeeping watchdog on CPU124: hpet wd-wd read-back delay of 245073ns
  Jan 23 14:31:18 fra-az1-comp-21 kernel: clocksource: wd-tsc-wd read-back delay of 231034ns, clock-skew test skipped!
  Jan 23 15:13:52 fra-az1-comp-21 kernel: clocksource: timekeeping watchdog on CPU24: hpet wd-wd read-back delay of 244863ns
  Jan 23 15:13:52 fra-az1-comp-21 kernel: clocksource: wd-tsc-wd read-back delay of 245701ns, clock-skew test skipped!
  Jan 23 15:35:18 fra-az1-comp-21 kernel: clocksource: timekeeping watchdog on CPU76: hpet wd-wd read-back delay of 245282ns
  Jan 23 15:35:18 fra-az1-comp-21 kernel: clocksource: wd-tsc-wd read-back delay of 245841ns, clock-skew test skipped!
  Jan 23 16:10:49 fra-az1-comp-21 kernel: clocksource: timekeeping watchdog on CPU27: hpet wd-wd read-back delay of 244653ns
  Jan 23 16:10:49 fra-az1-comp-21 kernel: clocksource: wd-tsc-wd read-back delay of 245980ns, clock-skew test skipped!
  Jan 23 16:12:49 fra-az1-comp-21 kernel: workqueue: drain_vmap_area_work hogged CPU for >1us 4 times, consider switching to WQ_UNBOUND
  Jan 23 16:20:56 fra-az1-comp-21 kernel: clocksource: timekeeping watchdog on CPU0: hpet wd-wd read-back delay of 242907ns
  Jan 23 16:20:56 fra-az1-comp-21 kernel: clocksource: wd-tsc-wd read-back delay of 247796ns, clock-skew test skipped!
  Jan 23 16:25:04 fra-az1-comp-21 kernel: [ cut here ]
  Jan 23 16:25:04 fra-az1-comp-21 kernel: refcount_t: underflow; use-after-free.
  Jan 23 16:25:04 fra-az1-comp-21 kernel: WARNING: CPU: 84 PID: 7072 at lib/refcount.c:28 refcount_warn_saturate+0xa3/0x150
  Jan 23 16:25:04 fra-az1-comp-21 kernel: Modules linked in: xt_multiport ebt_arp nft_meta_bridge xt_CT xt_mac xt_set xt_state ip_set_hash_net ip_set vhost_net vhost vhost_iotlb tap xt_policy xt_REDIRECT xt_nat xt_connmark xt_mark vxlan ip6_udp_tunnel udp_tunnel xt_comment xt_physdev veth xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink xfrm_user xfrm_algo nvme_fabrics 8021q garp mrp br_netfilter bridge stp llc bonding binfmt_misc tls nls_ascii ipmi_ssif intel_rapl_msr intel_rapl_common amd64_edac edac_mce_amd kvm_amd kvm irqbypass rapl wmi_bmof irdma ib_uverbs ib_core joydev input_leds ccp k10temp ptdma switchtec acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler mac_hid sch_fq_codel efi_pstore ip_tables x_tables autofs4 dm_crypt raid10 raid456 async_raid6_recov async_memcpy
[Kernel-packages] [Bug 2051123] Re: Kernel traces leading to crash - refcount_t: underflow; use-after-free and refcount_t: saturated; leaking memory -- lib/refcount.c
Just noticed master-next has disabled tasklets for dm-crypt.
https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/mantic/commit/?h=master-next=13104eddc76990dc3e4183cff050c9b6dc5e859e

I suppose hwe-6.5 will sync from mantic later, so please try with the newer kernel.
[Kernel-packages] [Bug 2051123] Re: Kernel traces leading to crash - refcount_t: underflow; use-after-free and refcount_t: saturated; leaking memory -- lib/refcount.c
Could you try the latest upstream kernel which convert dm-crypt's tasklet to BH workqueue? I suppose the commit fb6ad4aec1d0 ("dm-crypt: Convert from tasklet to BH workqueue") might resolve the issue.
[Kernel-packages] [Bug 2042363] Re: AIX 7.3 NFS client frequently returns an EIO error to an application when reading or writing to a file that has been locked with fcntl() on a Ubuntu 20.04 NFSV4 ser
** Attachment added: "RENEW packets between 9.20.32.85 (server) and 9.20.120.127 (7.2 client)"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2042363/+attachment/5767206/+files/7.2nfs.png
[Kernel-packages] [Bug 2042363] Re: AIX 7.3 NFS client frequently returns an EIO error to an application when reading or writing to a file that has been locked with fcntl() on a Ubuntu 20.04 NFSV4 ser
** Attachment added: "packets for 9.20.32.85 (server) and 9.20.120.112 (7.3 client)" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2042363/+attachment/5767207/+files/7.3nfs.png -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/2042363 Title: AIX 7.3 NFS client frequently returns an EIO error to an application when reading or writing to a file that has been locked with fcntl() on a Ubuntu 20.04 NFSV4 server Status in linux package in Ubuntu: New Bug description: ---Problem Description--- AIX 7.3 NFS client frequently returns an EIO error to an application when reading or writing to a file that has been locked with fcntl(). NFS server is Ubuntu 20.04.6 LTS, GNU/Linux 5.4.0-139-generic x86_64. The problem does not appear to affect other combinations of NFS client (including AIX 7.2) with this NFS server. The AIX team have indicated that the cause of the EIO is triggered by the NFS server returning a BAD_SEQID error which leads to the AIX NFS client incorrectly zeroing the stateid, which then leads to the NFS server returning a BAD_STATEID error and the NFS client then returns the EIO error. The AIX team would like to understand why the BAD_SEQID has been returned. ---uname output--- Linux duckseason 5.4.0-156-generic #173-Ubuntu SMP Tue Jul 11 07:25:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux Machine Type = VMware ESXi Server 7.0 4 x Intel(R) Xeon(R) Gold 6348H CPU @ 2.30GHz ---Steps to Reproduce--- We cannot offer a simple way to recreate the problem as it involves IBM MQ running on two primary machines (AIX) using the Ubuntu server for it's HA NFSv4 storage. However, we can provide any requested trace or dumps from any or all of the involved machines. 
[Kernel-packages] [Bug 2042363] Re: AIX 7.3 NFS client frequently returns an EIO error to an application when reading or writing to a file that has been locked with fcntl() on a Ubuntu 20.04 NFSV4 ser
Sorry, I can't tell which parts of the logs in the attachments (#comment11, #comment12 and #comment13) belong to the connection from the working 7.2 client and which to the non-working 7.3 client. All the attachments contain "TCP recvfrom got EAGAIN", which should come from the 7.3 connection.

$ grep "TCP recvfrom got EAGAIN" syslog_16042024_amaliada_primary_adamsongrunter_partner_both_aix73_part1.log -r|wc -l
213127
$ grep "TCP recvfrom got EAGAIN" syslog_16042024_amaliada_primary_adamsongrunter_partner_both_aix73_part2.log -r|wc -l
226005
$ grep "TCP recvfrom got EAGAIN" syslog_17042024_adia_primary_amberjack_partner_both_aix72.log -r|wc -l
20233

May I suggest collecting those logs in two separate files, one from 7.2 and another from 7.3, instead of mixing them together?

I am not a network expert, but I see some NFS RENEW operation packets between 9.20.32.85 (server) and 9.20.120.127 (7.2 client) in tcp_dump17_04_2024_09H_10M, but no such RENEW packets between 9.20.32.85 (server) and 9.20.120.112 (7.3 client) in tcpdump16_04_2024_14H_03M. Given that NFSv4 is a stateful filesystem based on leases, it is possible for the server to return EAGAIN when the client does not send an operation to renew the lease. Please also check whether the 7.3 client behaves differently from the 7.2 client with regard to lease renewal.

-- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/2042363 Title: AIX 7.3 NFS client frequently returns an EIO error to an application when reading or writing to a file that has been locked with fcntl() on a Ubuntu 20.04 NFSV4 server Status in linux package in Ubuntu: New Bug description: ---Problem Description--- AIX 7.3 NFS client frequently returns an EIO error to an application when reading or writing to a file that has been locked with fcntl(). NFS server is Ubuntu 20.04.6 LTS, GNU/Linux 5.4.0-139-generic x86_64. The problem does not appear to affect other combinations of NFS client (including AIX 7.2) with this NFS server.
The AIX team have indicated that the cause of the EIO is triggered by the NFS server returning a BAD_SEQID error which leads to the AIX NFS client incorrectly zeroing the stateid, which then leads to the NFS server returning a BAD_STATEID error and the NFS client then returns the EIO error. The AIX team would like to understand why the BAD_SEQID has been returned. ---uname output--- Linux duckseason 5.4.0-156-generic #173-Ubuntu SMP Tue Jul 11 07:25:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux Machine Type = VMware ESXi Server 7.0 4 x Intel(R) Xeon(R) Gold 6348H CPU @ 2.30GHz ---Steps to Reproduce--- We cannot offer a simple way to recreate the problem as it involves IBM MQ running on two primary machines (AIX) using the Ubuntu server for it's HA NFSv4 storage. However, we can provide any requested trace or dumps from any or all of the involved machines. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2042363/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
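The per-file counts quoted in the reply above can also be produced with a few lines of Python, which may be handy once the 7.2 and 7.3 logs are collected separately. This is only a minimal sketch: the helper names `count_marker` and `count_per_file` are made up for illustration, and the marker string is the one used in the grep commands.

```python
from typing import Dict, Iterable

# Marker string taken from the grep commands in the reply.
MARKER = "TCP recvfrom got EAGAIN"


def count_marker(lines: Iterable[str], marker: str = MARKER) -> int:
    """Count the lines that contain the marker (like grep ... | wc -l)."""
    return sum(1 for line in lines if marker in line)


def count_per_file(paths: Iterable[str], marker: str = MARKER) -> Dict[str, int]:
    """Return a {path: count} mapping for the given log files (hypothetical helper)."""
    counts = {}
    for path in paths:
        # errors="replace" keeps the scan going if a log has odd bytes in it.
        with open(path, errors="replace") as handle:
            counts[path] = count_marker(handle, marker)
    return counts
```

Calling `count_per_file()` with one log captured from the 7.2 client and one from the 7.3 client would give the two numbers side by side.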
[Kernel-packages] [Bug 2053194] Re: latest kernel update breaks sata hotplug on z690
Could someone run git bisect to find which commit caused the regression? My wild guess would be the commit that adds the Intel Alder Lake-P AHCI controller to the low power chipsets list, or one of the others listed below.

$ git log --oneline Ubuntu-5.15.0-92.102..Ubuntu-5.15.0-94.104 |grep "ata: ahci"
c553eda3bac6 ata: ahci: Add Intel Alder Lake-P AHCI controller to low power chipsets list
0e2b3a2aa29d ata: ahci: Add Elkhart Lake AHCI controller
6fd0f4242184 ata: ahci: Rename board_ahci_mobile
65ecbaa1fe47 ata: ahci: Add support for AMD A85 FCH (Hudson D4)
c21705b5ee4f ata: ahci: Drop pointless VPRINTK() calls and convert the remaining ones
$ git log --oneline Ubuntu-hwe-6.5-6.5.0-15.15_22.04.1..Ubuntu-hwe-6.5-6.5.0-17.17_22.04.1|grep "ata: ahci"
df508bb822f9 ata: ahci: Add Intel Alder Lake-P AHCI controller to low power chipsets list
2a4dad1ecdf3 ata: ahci: Add Elkhart Lake AHCI controller

-- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-signed-hwe-6.5 in Ubuntu. https://bugs.launchpad.net/bugs/2053194 Title: latest kernel update breaks sata hotplug on z690 Status in linux-signed-hwe-6.5 package in Ubuntu: Confirmed Bug description: SATA hotplug is not working on these kernels: linux-image-6.5.0-17-generic linux-image-5.15.0-94-generic SATA hotplug is working on these kernels: linux-image-6.5.0-15-generic linux-image-5.15.0-92-generic Affected platform: Gigabyte Z690 AORUS PRO BIOS Version: F28 Note that I can only repro this on Z690. I also tried Z390 and Z370 platforms and SATA hotplug is working there. Steps to repro: 1. Enable SATA hotplug in BIOS 2. Boot to affected kernel 3.
Hotplug a SATA drive while monitoring kernel messages ProblemType: Bug DistroRelease: Ubuntu 22.04 Package: linux-image-6.5.0-17-generic 6.5.0-17.17~22.04.1 ProcVersionSignature: Ubuntu 6.5.0-17.17~22.04.1-generic 6.5.8 Uname: Linux 6.5.0-17-generic x86_64 ApportVersion: 2.20.11-0ubuntu82.5 Architecture: amd64 CasperMD5CheckResult: pass CurrentDesktop: ubuntu:GNOME Date: Wed Feb 14 13:20:34 2024 InstallationDate: Installed on 2024-01-26 (18 days ago) InstallationMedia: Ubuntu 22.04.3 LTS "Jammy Jellyfish" - Release amd64 (20230807.2) ProcEnviron: TERM=xterm-256color PATH=(custom, no user) XDG_RUNTIME_DIR= LANG=en_US.UTF-8 SHELL=/bin/bash SourcePackage: linux-signed-hwe-6.5 UpgradeStatus: No upgrade log present (probably fresh install) modified.conffile..etc.default.apport: # set this to 0 to disable apport, or to 1 to enable it # you can temporarily override this with # sudo service apport start force_start=1 enabled=0 mtime.conffile..etc.default.apport: 2024-01-26T15:01:36.495491 To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-signed-hwe-6.5/+bug/2053194/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
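For anyone unfamiliar with bisecting, the search requested in the reply above is a binary search over the commits between the last good and the first bad kernel. The sketch below only illustrates that idea; it is not how git itself is implemented. The function name `first_bad`, the commit labels and the `is_bad` callback (standing in for "build, boot and test SATA hotplug") are all assumptions made for the example.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit.

    Assumes commits are ordered oldest to newest, commits[0] is known
    good and commits[-1] is known bad.
    """
    if len(commits) < 2:
        raise ValueError("need at least one good and one bad commit")
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid    # regression is at mid or earlier
        else:
            good = mid   # regression is after mid
    return commits[bad]


# Hypothetical history of ten commits where the regression enters at "c6".
history = [f"c{i}" for i in range(10)]
tested = []


def hotplug_broken(commit: str) -> bool:
    tested.append(commit)           # record each build-and-test step
    return int(commit[1:]) >= 6     # stand-in for the real hardware test


culprit = first_bad(history, hotplug_broken)
```

In this made-up history only three commits need to be built and tested, which is why bisecting is practical even across a full kernel update.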
[Kernel-packages] [Bug 2061091] Re: Freezing user space processes failed after 20.008 seconds (1 tasks refusing to freeze, wq_busy=0)
It looks like suspend/resume can sometimes complete.

Apr 15 12:49:26 xps9320 kernel: [10214.971688] Freezing user space processes completed (elapsed 0.002 seconds)
Apr 15 21:45:23 xps9320 kernel: [30355.137512] Freezing user space processes completed (elapsed 0.015 seconds)
Apr 15 23:02:59 xps9320 kernel: [47737.172847] Freezing user space processes failed after 20.006 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:03:19 xps9320 kernel: [47757.538206] Freezing user space processes failed after 20.001 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:03:49 xps9320 kernel: [47787.948230] Freezing user space processes failed after 20.010 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:04:10 xps9320 kernel: [47808.293913] Freezing user space processes failed after 20.004 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:04:40 xps9320 kernel: [47838.963650] Freezing user space processes failed after 20.011 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:05:01 xps9320 kernel: [47859.321047] Freezing user space processes failed after 20.007 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:05:31 xps9320 kernel: [47889.930764] Freezing user space processes failed after 20.007 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:05:52 xps9320 kernel: [47910.270362] Freezing user space processes failed after 20.008 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:06:22 xps9320 kernel: [47940.942995] Freezing user space processes failed after 20.004 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:06:43 xps9320 kernel: [47961.284299] Freezing user space processes failed after 20.004 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:07:13 xps9320 kernel: [47991.938673] Freezing user space processes failed after 20.003 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:07:34 xps9320 kernel: [48012.283148] Freezing user space processes failed after 20.004 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:07:54 xps9320 kernel: [48032.682674] Freezing user space processes failed after 20.001 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:08:15 xps9320 kernel: [48053.066382] Freezing user space processes failed after 20.001 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:08:37 xps9320 kernel: [48075.379470] Freezing user space processes failed after 20.006 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:08:57 xps9320 kernel: [48095.707188] Freezing user space processes failed after 20.001 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:09:18 xps9320 kernel: [48116.090956] Freezing user space processes failed after 20.007 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:09:38 xps9320 kernel: [48136.438119] Freezing user space processes failed after 20.006 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:09:59 xps9320 kernel: [48157.256494] Freezing user space processes failed after 20.002 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:10:19 xps9320 kernel: [48177.593855] Freezing user space processes failed after 20.006 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:10:41 xps9320 kernel: [48199.473829] Freezing user space processes failed after 20.004 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:11:01 xps9320 kernel: [48219.806060] Freezing user space processes failed after 20.006 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:11:22 xps9320 kernel: [48240.188072] Freezing user space processes failed after 20.011 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:11:42 xps9320 kernel: [48260.947835] Freezing user space processes failed after 20.005 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 16 12:52:21 xps9320 kernel: [13407.856663] Freezing user space processes completed (elapsed 0.002 seconds)

Could you rebuild the kernel with the CONFIG_PROVE_LOCKING option to detect locking-related deadlocks? Then reproduce the issue by shutting down the laptop with the new kernel and upload the resulting log. Please also attach the output of lsusb, since the problem could be USB related. It would also be helpful to try the latest noble kernel or a recent upstream kernel, thanks.

-- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-signed-hwe-6.5 in Ubuntu. https://bugs.launchpad.net/bugs/2061091 Title: Freezing user space processes failed after 20.008 seconds (1 tasks refusing to freeze, wq_busy=0) Status in linux-signed-hwe-6.5 package in Ubuntu: New Bug description: Sometimes, when trying to shut down or suspend my laptop, it gets stuck on the console screen. If I was suspending, it eventually gives up and goes back to the X session. During a shutdown it hangs forever and the only solution seems to be to force a reboot with Magic-SysRq. The following appears in `kern.log`: ``` Apr 12 08:59:54 xps9320 kernel:
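To see at a glance how often the freeze step succeeds or fails in a full kern.log, the two message forms quoted in the reply above can be tallied with a small script. This is a rough sketch rather than an official tool: the regular expression is written against the two message shapes shown in this thread only, and `summarize_freezes` is a made-up helper name.

```python
import re
from typing import Iterable, Tuple

# Matches the two message forms seen in this thread:
#   "... completed (elapsed 0.002 seconds)"
#   "... failed after 20.006 seconds (1 tasks refusing to freeze, wq_busy=0)"
FREEZE_RE = re.compile(
    r"Freezing user space processes "
    r"(?:(?P<ok>completed) \(elapsed [\d.]+ seconds\)"
    r"|(?P<fail>failed) after [\d.]+ seconds "
    r"\(\d+ tasks refusing to freeze, wq_busy=\d+\))"
)


def summarize_freezes(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (completed, failed) counts for the freeze messages in lines."""
    completed = failed = 0
    for line in lines:
        match = FREEZE_RE.search(line)
        if match is None:
            continue  # unrelated kernel message
        if match.group("ok"):
            completed += 1
        else:
            failed += 1
    return completed, failed
```

Passing an open kern.log file object to `summarize_freezes()` works as well, since a file iterates line by line.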
[Kernel-packages] [Bug 2061091] Re: Freezing user space processes failed after 20.008 seconds (1 tasks refusing to freeze, wq_busy=0)
This call trace looks related to the current issue. I guess snd_power_ref_and_wait was waiting for snd_card_disconnect, which wakes up power_sleep at the end of its code, but snd_card_disconnect -> snd_device_disconnect_all -> snd_pcm_dev_disconnect was blocked for some reason.

Apr 15 21:48:23 xps9320 kernel: [43261.751897] INFO: task kworker/0:6:34416 blocked for more than 120 seconds.
Apr 15 21:48:23 xps9320 kernel: [43261.751900] Tainted: G OE 6.5.0-27-generic #28~22.04.1-Ubuntu
Apr 15 21:48:23 xps9320 kernel: [43261.751901] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 15 21:48:23 xps9320 kernel: [43261.751903] task:kworker/0:6 state:D stack:0 pid:34416 ppid:2 flags:0x4000
Apr 15 21:48:23 xps9320 kernel: [43261.751907] Workqueue: usb_hub_wq hub_event
Apr 15 21:48:23 xps9320 kernel: [43261.751913] Call Trace:
Apr 15 21:48:23 xps9320 kernel: [43261.751915]
Apr 15 21:48:23 xps9320 kernel: [43261.751916] __schedule+0x2cb/0x750
Apr 15 21:48:23 xps9320 kernel: [43261.751921] schedule+0x63/0x110
Apr 15 21:48:23 xps9320 kernel: [43261.751924] schedule_preempt_disabled+0x15/0x30
Apr 15 21:48:23 xps9320 kernel: [43261.751928] rwsem_down_write_slowpath+0x2a2/0x550
Apr 15 21:48:23 xps9320 kernel: [43261.751940] down_write+0x5c/0x80
Apr 15 21:48:23 xps9320 kernel: [43261.751945] snd_pcm_dev_disconnect+0x1d2/0x280 [snd_pcm]
Apr 15 21:48:23 xps9320 kernel: [43261.751964] snd_device_disconnect_all+0x47/0xa0 [snd]
Apr 15 21:48:23 xps9320 kernel: [43261.751979] snd_card_disconnect.part.0+0x10d/0x290 [snd]
Apr 15 21:48:23 xps9320 kernel: [43261.752019] ? rpm_idle+0x25/0x2b0
Apr 15 21:48:23 xps9320 kernel: [43261.752023] snd_card_disconnect+0x13/0x30 [snd]
Apr 15 21:48:23 xps9320 kernel: [43261.752039] usb_audio_disconnect+0x114/0x2c0 [snd_usb_audio]
Apr 15 21:48:23 xps9320 kernel: [43261.752064] usb_unbind_interface+0x8e/0x280
Apr 15 21:48:23 xps9320 kernel: [43261.752069] device_remove+0x65/0x80
Apr 15 21:48:23 xps9320 kernel: [43261.752072] device_release_driver_internal+0x20b/0x270
Apr 15 21:48:23 xps9320 kernel: [43261.752077] device_release_driver+0x12/0x20
Apr 15 21:48:23 xps9320 kernel: [43261.752080] bus_remove_device+0xcb/0x140
Apr 15 21:48:23 xps9320 kernel: [43261.752083] device_del+0x161/0x3e0
Apr 15 21:48:23 xps9320 kernel: [43261.752086] ? kobject_put+0x67/0xa0
Apr 15 21:48:23 xps9320 kernel: [43261.752090] usb_disable_device+0xd5/0x280
Apr 15 21:48:23 xps9320 kernel: [43261.752093] usb_disconnect+0xe9/0x2e0
Apr 15 21:48:23 xps9320 kernel: [43261.752097] usb_disconnect+0xcd/0x2e0
Apr 15 21:48:23 xps9320 kernel: [43261.752100] usb_disconnect+0xcd/0x2e0
Apr 15 21:48:23 xps9320 kernel: [43261.752102] ? usb_control_msg+0x106/0x160
Apr 15 21:48:23 xps9320 kernel: [43261.752105] hub_port_connect+0x90/0xc30
Apr 15 21:48:23 xps9320 kernel: [43261.752109] hub_port_connect_change+0x91/0x300
Apr 15 21:48:23 xps9320 kernel: [43261.752113] port_event+0x652/0x810
Apr 15 21:48:23 xps9320 kernel: [43261.752117] hub_event+0x155/0x450
Apr 15 21:48:23 xps9320 kernel: [43261.752120] process_one_work+0x23d/0x450
Apr 15 21:48:23 xps9320 kernel: [43261.752125] worker_thread+0x50/0x3f0
Apr 15 21:48:23 xps9320 kernel: [43261.752128] ? __pfx_worker_thread+0x10/0x10
Apr 15 21:48:23 xps9320 kernel: [43261.752131] kthread+0xef/0x120
Apr 15 21:48:23 xps9320 kernel: [43261.752135] ? __pfx_kthread+0x10/0x10
Apr 15 21:48:23 xps9320 kernel: [43261.752139] ret_from_fork+0x44/0x70
Apr 15 21:48:23 xps9320 kernel: [43261.752143] ? __pfx_kthread+0x10/0x10
Apr 15 21:48:23 xps9320 kernel: [43261.752147] ret_from_fork_asm+0x1b/0x30
Apr 15 21:48:23 xps9320 kernel: [43261.752151]

-- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-signed-hwe-6.5 in Ubuntu. https://bugs.launchpad.net/bugs/2061091 Title: Freezing user space processes failed after 20.008 seconds (1 tasks refusing to freeze, wq_busy=0) Status in linux-signed-hwe-6.5 package in Ubuntu: New Bug description: Sometimes, when trying to shut down or suspend my laptop, it gets stuck on the console screen. If I was suspending, it eventually gives up and goes back to the X session. During a shutdown it hangs forever and the only solution seems to be to force a reboot with Magic-SysRq. The following appears in `kern.log`: ``` Apr 12 08:59:54 xps9320 kernel: [173172.510341] Freezing user space processes failed after 20.008 seconds (1 tasks refusing to freeze, wq_busy=0): Apr 12 08:59:54 xps9320 kernel: [173172.515669] task:wireplumber state:D stack:0 pid:2408 ppid:2398 flags:0x0006 Apr 12 08:59:54 xps9320 kernel: [173172.518923] Call Trace: Apr 12 08:59:54 xps9320 kernel: [173172.521755] Apr 12 08:59:54 xps9320 kernel: [173172.524099] __schedule+0x2cb/0x750 Apr 12 08:59:54 xps9320 kernel: [173172.526333] schedule+0x63/0x110 Apr 12 08:59:54 xps9320 kernel:
[Kernel-packages] [Bug 2061091] Re: Freezing user space processes failed after 20.008 seconds (1 tasks refusing to freeze, wq_busy=0)
There is another call trace in kern.log, which might be a separate issue (and probably needs a separate bug report).

Apr 15 21:45:23 xps9320 kernel: [43079.109011]
Apr 15 21:45:23 xps9320 kernel: [43079.112281] UBSAN: shift-out-of-bounds in /build/linux-hwe-6.5-jkqeMi/linux-hwe-6.5-6.5.0/drivers/gpu/drm/display/drm_dp_mst_topology.c:4416:36
Apr 15 21:45:23 xps9320 kernel: [43079.115601] shift exponent -1 is negative
Apr 15 21:45:23 xps9320 kernel: [43079.119073] CPU: 0 PID: 34404 Comm: kworker/0:3 Tainted: G OE 6.5.0-27-generic #28~22.04.1-Ubuntu
Apr 15 21:45:23 xps9320 kernel: [43079.122523] Hardware name: Dell Inc. XPS 9320/0CW9KM, BIOS 2.8.0 11/13/2023
Apr 15 21:45:23 xps9320 kernel: [43079.126024] Workqueue: events output_poll_execute [drm_kms_helper]
Apr 15 21:45:23 xps9320 kernel: [43079.129530] Call Trace:
Apr 15 21:45:23 xps9320 kernel: [43079.132985]
Apr 15 21:45:23 xps9320 kernel: [43079.136412] dump_stack_lvl+0x48/0x70
Apr 15 21:45:23 xps9320 kernel: [43079.139805] dump_stack+0x10/0x20
Apr 15 21:45:23 xps9320 kernel: [43079.143170] __ubsan_handle_shift_out_of_bounds+0x1ac/0x360
Apr 15 21:45:23 xps9320 kernel: [43079.146525] drm_dp_atomic_release_time_slots.cold+0x17/0x3d [drm_display_helper]
Apr 15 21:45:23 xps9320 kernel: [43079.149904] intel_dp_mst_atomic_check+0xaa/0x180 [i915]
Apr 15 21:45:23 xps9320 kernel: [43079.153378] ? update_connector_routing+0x2fc/0x3f0 [drm_kms_helper]
Apr 15 21:45:23 xps9320 kernel: [43079.156738] drm_atomic_helper_check_modeset+0x300/0x610 [drm_kms_helper]
Apr 15 21:45:23 xps9320 kernel: [43079.160084] intel_atomic_check+0xfe/0xb80 [i915]
Apr 15 21:45:23 xps9320 kernel: [43079.163518] ? drm_plane_check_pixel_format+0x53/0xe0 [drm]
Apr 15 21:45:23 xps9320 kernel: [43079.166858] drm_atomic_check_only+0x1ac/0x400 [drm]
Apr 15 21:45:23 xps9320 kernel: [43079.170172] ? update_output_state+0x184/0x1a0 [drm]
Apr 15 21:45:23 xps9320 kernel: [43079.173478] drm_atomic_commit+0x58/0xd0 [drm]
Apr 15 21:45:23 xps9320 kernel: [43079.176765] ? __pfx___drm_printfn_info+0x10/0x10 [drm]
Apr 15 21:45:23 xps9320 kernel: [43079.180049] drm_client_modeset_commit_atomic+0x203/0x240 [drm]
Apr 15 21:45:23 xps9320 kernel: [43079.183332] drm_client_modeset_commit_locked+0x5b/0x170 [drm]
Apr 15 21:45:23 xps9320 kernel: [43079.186586] ? mutex_lock+0x12/0x50
Apr 15 21:45:23 xps9320 kernel: [43079.189776] drm_client_modeset_commit+0x26/0x50 [drm]
Apr 15 21:45:23 xps9320 kernel: [43079.192980] __drm_fb_helper_restore_fbdev_mode_unlocked+0xc2/0x100 [drm_kms_helper]
Apr 15 21:45:23 xps9320 kernel: [43079.196179] drm_fb_helper_hotplug_event+0x10b/0x120 [drm_kms_helper]
Apr 15 21:45:23 xps9320 kernel: [43079.199336] intel_fbdev_output_poll_changed+0x6b/0xb0 [i915]
Apr 15 21:45:23 xps9320 kernel: [43079.202568] output_poll_execute+0x237/0x280 [drm_kms_helper]
Apr 15 21:45:23 xps9320 kernel: [43079.205686] ? __schedule+0x2d3/0x750
Apr 15 21:45:23 xps9320 kernel: [43079.208777] process_one_work+0x23d/0x450
Apr 15 21:45:23 xps9320 kernel: [43079.211856] worker_thread+0x50/0x3f0
Apr 15 21:45:23 xps9320 kernel: [43079.214921] ? __pfx_worker_thread+0x10/0x10
Apr 15 21:45:23 xps9320 kernel: [43079.217984] kthread+0xef/0x120
Apr 15 21:45:23 xps9320 kernel: [43079.221037] ? __pfx_kthread+0x10/0x10
Apr 15 21:45:23 xps9320 kernel: [43079.224072] ret_from_fork+0x44/0x70
Apr 15 21:45:23 xps9320 kernel: [43079.227100] ? __pfx_kthread+0x10/0x10
Apr 15 21:45:23 xps9320 kernel: [43079.230123] ret_from_fork_asm+0x1b/0x30
Apr 15 21:45:23 xps9320 kernel: [43079.233124]
Apr 15 21:45:23 xps9320 kernel: [43079.235686]

-- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-signed-hwe-6.5 in Ubuntu.
https://bugs.launchpad.net/bugs/2061091 Title: Freezing user space processes failed after 20.008 seconds (1 tasks refusing to freeze, wq_busy=0) Status in linux-signed-hwe-6.5 package in Ubuntu: New Bug description: Sometimes, when trying to shut down or suspend my laptop, it gets stuck on the console screen. If I was suspending, it eventually gives up and goes back to the X session. During a shutdown it hangs forever and the only solution seems to be to force a reboot with Magic-SysRq. The following appears in `kern.log`: ``` Apr 12 08:59:54 xps9320 kernel: [173172.510341] Freezing user space processes failed after 20.008 seconds (1 tasks refusing to freeze, wq_busy=0): Apr 12 08:59:54 xps9320 kernel: [173172.515669] task:wireplumber state:D stack:0 pid:2408 ppid:2398 flags:0x0006 Apr 12 08:59:54 xps9320 kernel: [173172.518923] Call Trace: Apr 12 08:59:54 xps9320 kernel: [173172.521755] Apr 12 08:59:54 xps9320 kernel: [173172.524099] __schedule+0x2cb/0x750 Apr 12 08:59:54 xps9320 kernel:
[Kernel-packages] [Bug 2061091] Re: Freezing user space processes failed after 20.008 seconds (1 tasks refusing to freeze, wq_busy=0)
Could you share which audio driver is used in the affected system by "cat /proc/asound/modules"? And pls attach the full kern.log to check if something is suspicious here. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-signed-hwe-6.5 in Ubuntu. https://bugs.launchpad.net/bugs/2061091 Title: Freezing user space processes failed after 20.008 seconds (1 tasks refusing to freeze, wq_busy=0) Status in linux-signed-hwe-6.5 package in Ubuntu: New Bug description: Sometimes, when trying to shut down or suspend my laptop, it gets stuck on the console screen. If I was suspending, it eventually gives up and goes back to the X session. During a shutdown it hangs forever and the only solution seems to be to force a reboot with Magic-SysRq. The following appears in `kern.log`: ``` Apr 12 08:59:54 xps9320 kernel: [173172.510341] Freezing user space processes failed after 20.008 seconds (1 tasks refusing to freeze, wq_busy=0): Apr 12 08:59:54 xps9320 kernel: [173172.515669] task:wireplumber state:D stack:0 pid:2408 ppid:2398 flags:0x0006 Apr 12 08:59:54 xps9320 kernel: [173172.518923] Call Trace: Apr 12 08:59:54 xps9320 kernel: [173172.521755] Apr 12 08:59:54 xps9320 kernel: [173172.524099] __schedule+0x2cb/0x750 Apr 12 08:59:54 xps9320 kernel: [173172.526333] schedule+0x63/0x110 Apr 12 08:59:54 xps9320 kernel: [173172.528585] snd_power_ref_and_wait+0xe5/0x140 [snd] Apr 12 08:59:54 xps9320 kernel: [173172.530825] ? __pfx_autoremove_wake_function+0x10/0x10 Apr 12 08:59:54 xps9320 kernel: [173172.533103] snd_ctl_elem_info+0x4f/0x1b0 [snd] Apr 12 08:59:54 xps9320 kernel: [173172.535354] snd_ctl_elem_info_user+0x59/0xc0 [snd] Apr 12 08:59:54 xps9320 kernel: [173172.537598] snd_ctl_ioctl+0x1d4/0x650 [snd] Apr 12 08:59:54 xps9320 kernel: [173172.539846] ? 
__fget_light+0xa5/0x120 Apr 12 08:59:54 xps9320 kernel: [173172.542082] __x64_sys_ioctl+0xa0/0xf0 Apr 12 08:59:54 xps9320 kernel: [173172.544334] do_syscall_64+0x58/0x90 Apr 12 08:59:54 xps9320 kernel: [173172.546566] ? syscall_exit_to_user_mode+0x37/0x60 Apr 12 08:59:54 xps9320 kernel: [173172.548838] ? do_syscall_64+0x67/0x90 Apr 12 08:59:54 xps9320 kernel: [173172.551085] ? do_syscall_64+0x67/0x90 Apr 12 08:59:54 xps9320 kernel: [173172.553342] ? syscall_exit_to_user_mode+0x37/0x60 Apr 12 08:59:54 xps9320 kernel: [173172.96] ? do_syscall_64+0x67/0x90 Apr 12 08:59:54 xps9320 kernel: [173172.557828] ? common_interrupt+0x54/0xb0 Apr 12 08:59:54 xps9320 kernel: [173172.560075] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 Apr 12 08:59:54 xps9320 kernel: [173172.562321] RIP: 0033:0x70f3adb1a94f Apr 12 08:59:54 xps9320 kernel: [173172.564650] RSP: 002b:7ffef2072940 EFLAGS: 0246 ORIG_RAX: 0010 Apr 12 08:59:54 xps9320 kernel: [173172.566925] RAX: ffda RBX: 7ffef20729b0 RCX: 70f3adb1a94f Apr 12 08:59:54 xps9320 kernel: [173172.569222] RDX: 7ffef20729b0 RSI: c1105511 RDI: 0022 Apr 12 08:59:54 xps9320 kernel: [173172.571501] RBP: 7ffef2072b90 R08: 65107d2b6070 R09: 0004 Apr 12 08:59:54 xps9320 kernel: [173172.573762] R10: f014 R11: 0246 R12: 65107d2ee100 Apr 12 08:59:54 xps9320 kernel: [173172.576047] R13: 65107d1fb400 R14: 7ffef2072b30 R15: 7ffef2072ad0 Apr 12 08:59:54 xps9320 kernel: [173172.578308] ``` Another point of interest is that, when in this situation: * `lsusb` hangs after printing out a few lines * Outgoing SSH connections hang unless I clear SSH_AUTH_SOCK. gpg-agent appears to be trying to check for smartcards. So I guess there is some sort of deadlock in the USB subsystem which is causing all the other problems? 
ProblemType: Bug DistroRelease: Ubuntu 22.04 Package: linux-image-6.5.0-27-generic 6.5.0-27.28~22.04.1 ProcVersionSignature: Ubuntu 6.5.0-27.28~22.04.1-generic 6.5.13 Uname: Linux 6.5.0-27-generic x86_64 ApportVersion: 2.20.11-0ubuntu82.5 Architecture: amd64 CasperMD5CheckResult: pass CurrentDesktop: ubuntu:GNOME Date: Fri Apr 12 09:42:52 2024 InstallationDate: Installed on 2022-06-28 (653 days ago) InstallationMedia: Xubuntu 22.04 LTS "Jammy Jellyfish" - Release amd64 (20220419) SourcePackage: linux-signed-hwe-6.5 UpgradeStatus: No upgrade log present (probably fresh install) To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-signed-hwe-6.5/+bug/2061091/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 2042363] Re: AIX 7.3 NFS client frequently returns an EIO error to an application when reading or writing to a file that has been locked with fcntl() on a Ubuntu 20.04 NFSV4 ser
From the following excerpt of the trace file:

Nov 30 11:13:40 duckseason kernel: [1291756.354728] nfsd_dispatch: vers 4 proc 1
Nov 30 11:13:40 duckseason kernel: [1291756.354731] svc: server 7c7e7536, pool 0, transport 3fd86d34, inuse=3
Nov 30 11:13:40 duckseason kernel: [1291756.354732] process_renew(6554b87b/4ab45507): starting
Nov 30 11:13:40 duckseason kernel: [1291756.354734] svc: tcp_recv 3fd86d34 data 1 conn 0 close 0
Nov 30 11:13:40 duckseason kernel: [1291756.354736] svc: socket 3fd86d34 recvfrom(03fecffb, 4) = -11
Nov 30 11:13:40 duckseason kernel: [1291756.354737] RPC: TCP recv_record got -11
Nov 30 11:13:40 duckseason kernel: [1291756.354737] RPC: TCP recvfrom got EAGAIN

we can see that the NFS server returns -11 (EAGAIN), which can be reached through the path

svc_recv -> svc_handle_xprt -> xprt->xpt_ops->xpo_recvfrom
svc_tcp_recvfrom -> svc_recvfrom -> sock_recvmsg

which probably triggers

sock_recvmsg_nosec -> ... -> tcp_recvmsg

As mentioned in the recvfrom manpage:

ERRORS
The recvfrom() function shall fail if:
EAGAIN or EWOULDBLOCK
The socket's file descriptor is marked O_NONBLOCK and no data is waiting to be received; or MSG_OOB is set and no out-of-band data is available and either the socket's file descriptor is marked O_NONBLOCK or the socket does not support blocking to await out-of-band data.

I am not sure whether the 7.3 NFS client opened a non-blocking socket and there was no data to be read on that socket. So I would like to check whether the 7.3 client sent something different from the 7.2 client that caused the server to return BAD_SEQID to the AIX 7.3 client. Please also collect the relevant trace log on the server side when connecting with the 7.2 client, so that we can investigate the difference between the good case and the bad one. If possible, please also try the latest 5.4 stable (5.4.274) and the upstream version (6.9-rc4).

-- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2042363 Title: AIX 7.3 NFS client frequently returns an EIO error to an application when reading or writing to a file that has been locked with fcntl() on a Ubuntu 20.04 NFSV4 server Status in linux package in Ubuntu: New Bug description: ---Problem Description--- AIX 7.3 NFS client frequently returns an EIO error to an application when reading or writing to a file that has been locked with fcntl(). NFS server is Ubuntu 20.04.6 LTS, GNU/Linux 5.4.0-139-generic x86_64. The problem does not appear to affect other combinations of NFS client (including AIX 7.2) with this NFS server. The AIX team have indicated that the cause of the EIO is triggered by the NFS server returning a BAD_SEQID error which leads to the AIX NFS client incorrectly zeroing the stateid, which then leads to the NFS server returning a BAD_STATEID error and the NFS client then returns the EIO error. The AIX team would like to understand why the BAD_SEQID has been returned. ---uname output--- Linux duckseason 5.4.0-156-generic #173-Ubuntu SMP Tue Jul 11 07:25:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux Machine Type = VMware ESXi Server 7.0 4 x Intel(R) Xeon(R) Gold 6348H CPU @ 2.30GHz ---Steps to Reproduce--- We cannot offer a simple way to recreate the problem as it involves IBM MQ running on two primary machines (AIX) using the Ubuntu server for it's HA NFSv4 storage. However, we can provide any requested trace or dumps from any or all of the involved machines. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2042363/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
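The recvfrom behaviour quoted from the manpage in the reply above is easy to observe from user space. The snippet below is only an illustration with an ordinary local socket pair; it does not exercise the kernel's svc_tcp_recvfrom path, and `recv_on_empty_nonblocking_socket` is a made-up name. On Linux, errno.EAGAIN is 11, which matches the -11 in the trace.

```python
import errno
import socket


def recv_on_empty_nonblocking_socket() -> int:
    """Return the errno raised by recv() on a non-blocking socket with no data."""
    left, right = socket.socketpair()
    try:
        left.setblocking(False)      # equivalent to marking the fd O_NONBLOCK
        try:
            left.recv(4)             # nothing has been sent by the peer yet
        except BlockingIOError as exc:
            return exc.errno         # EAGAIN / EWOULDBLOCK
        return 0                     # not expected: data was available
    finally:
        left.close()
        right.close()


code = recv_on_empty_nonblocking_socket()
print(errno.errorcode.get(code, code))
```

Seen this way, EAGAIN on a non-blocking socket only says that no data was queued at that moment, so it may be worth confirming separately whether it is tied to the BAD_SEQID reply at all.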
[Kernel-packages] [Bug 2058750] Re: i915 GPU HANG: ecode 12:1:db96edba
The log from #comment3 looks similar to the one in this link. https://www.reddit.com/r/archlinux/comments/14zifl8/gpu_hang_issue_on_laptop_had_to_hard_shutdown/ Could you try the latest firmware from upstream? The latest version is 70.20.0 per the log. https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/log/i915/adlp_guc_70.bin -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-signed-hwe-6.5 in Ubuntu. https://bugs.launchpad.net/bugs/2058750 Title: i915 GPU HANG: ecode 12:1:db96edba Status in linux-signed-hwe-6.5 package in Ubuntu: New Bug description: This has happened several times over a period of months when visiting the site https://www.windy.com and viewing satellite images. The computer completely freezes and must be hard restarted. I don't think I have triggered this freeze by visiting any other site. ProblemType: Bug DistroRelease: Ubuntu 22.04 Package: linux-image-6.5.0-26-generic 6.5.0-26.26~22.04.1 ProcVersionSignature: Ubuntu 6.5.0-26.26~22.04.1-generic 6.5.13 Uname: Linux 6.5.0-26-generic x86_64 ApportVersion: 2.20.11-0ubuntu82.5 Architecture: amd64 CasperMD5CheckResult: pass CurrentDesktop: ubuntu:GNOME Date: Fri Mar 22 14:14:41 2024 InstallationDate: Installed on 2023-04-12 (344 days ago) InstallationMedia: Ubuntu 22.04.2 LTS "Jammy Jellyfish" - Release amd64 (20230223) SourcePackage: linux-signed-hwe-6.5 UpgradeStatus: No upgrade log present (probably fresh install) To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-signed-hwe-6.5/+bug/2058750/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 2060217] Re: NFSv4 fails to mount in noble/s390x
Thanks, Vasily. After the upstream maintainer merges it, maybe you can send it to the Ubuntu kernel list as well, or wait until noble is updated to a future upstream stable release, since the patch has been CCed to sta...@vger.kernel.org and also has a Fixes tag. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/2060217 Title: NFSv4 fails to mount in noble/s390x Status in Ubuntu on IBM z Systems: New Status in linux package in Ubuntu: New Status in nfs-utils package in Ubuntu: Triaged Bug description: https://autopkgtest.ubuntu.com/packages/n/nfs-utils/noble/s390x Looks like it has been failing for a long time already. Log: https://autopkgtest.ubuntu.com/results/autopkgtest-noble/noble/s390x/n/nfs-utils/20240404_145924_ef255@/log.gz 339s autopkgtest [14:41:04]: test local-server-client: [--- 340s Killed 340s autopkgtest [14:41:05]: test process requested reboot with marker boot1 364s autopkgtest-virt-ssh: WARNING: ssh connection failed. Retrying in 3 seconds... 372s FAIL: nfs_home not mounted 373s autopkgtest [14:41:38]: test local-server-client: ---] 373s local-server-client FAIL non-zero exit status 1 and 934s autopkgtest [14:50:59]: test kerberos-mount: [--- 935s Initializing database '/var/lib/krb5kdc/principal' for realm 'DEP8', 935s master key name 'K/M@DEP8' 935s Authenticating as principal root/admin@DEP8 with password. 935s Principal "nfs/nfs-server.dep8@DEP8" created. 935s Authenticating as principal root/admin@DEP8 with password. 935s Entry for principal nfs/nfs-server.dep8 with kvno 2, encryption type aes256-cts-hmac-sha1-96 added to keytab FILE:/etc/krb5.keytab. 935s Entry for principal nfs/nfs-server.dep8 with kvno 2, encryption type aes128-cts-hmac-sha1-96 added to keytab FILE:/etc/krb5.keytab. 935s Authenticating as principal root/admin@DEP8 with password. 935s Principal "host/nfs-server.dep8@DEP8" created.
935s Authenticating as principal root/admin@DEP8 with password. 935s Entry for principal host/nfs-server.dep8 with kvno 2, encryption type aes256-cts-hmac-sha1-96 added to keytab FILE:/etc/krb5.keytab. 935s Entry for principal host/nfs-server.dep8 with kvno 2, encryption type aes128-cts-hmac-sha1-96 added to keytab FILE:/etc/krb5.keytab. 936s exporting *:/storage 938s mount.nfs: mount system call failed for /mnt 938s umount: /mnt: not mounted. 938s autopkgtest [14:51:02]: test kerberos-mount: ---] 939s kerberos-mount FAIL non-zero exit status 32 To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu-z-systems/+bug/2060217/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 2060217] Re: NFSv4 fails to mount in noble/s390x
Thanks for the info. I tested v6.6, which was fine. I will dig into the change. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/2060217 Title: NFSv4 fails to mount in noble/s390x Status in Ubuntu on IBM z Systems: New Status in linux package in Ubuntu: New Status in nfs-utils package in Ubuntu: Triaged Bug description: https://autopkgtest.ubuntu.com/packages/n/nfs-utils/noble/s390x Looks like it has been failing for a long time already. Log: https://autopkgtest.ubuntu.com/results/autopkgtest-noble/noble/s390x/n/nfs-utils/20240404_145924_ef255@/log.gz 339s autopkgtest [14:41:04]: test local-server-client: [--- 340s Killed 340s autopkgtest [14:41:05]: test process requested reboot with marker boot1 364s autopkgtest-virt-ssh: WARNING: ssh connection failed. Retrying in 3 seconds... 372s FAIL: nfs_home not mounted 373s autopkgtest [14:41:38]: test local-server-client: ---] 373s local-server-client FAIL non-zero exit status 1 and 934s autopkgtest [14:50:59]: test kerberos-mount: [--- 935s Initializing database '/var/lib/krb5kdc/principal' for realm 'DEP8', 935s master key name 'K/M@DEP8' 935s Authenticating as principal root/admin@DEP8 with password. 935s Principal "nfs/nfs-server.dep8@DEP8" created. 935s Authenticating as principal root/admin@DEP8 with password. 935s Entry for principal nfs/nfs-server.dep8 with kvno 2, encryption type aes256-cts-hmac-sha1-96 added to keytab FILE:/etc/krb5.keytab. 935s Entry for principal nfs/nfs-server.dep8 with kvno 2, encryption type aes128-cts-hmac-sha1-96 added to keytab FILE:/etc/krb5.keytab. 935s Authenticating as principal root/admin@DEP8 with password. 935s Principal "host/nfs-server.dep8@DEP8" created. 935s Authenticating as principal root/admin@DEP8 with password. 935s Entry for principal host/nfs-server.dep8 with kvno 2, encryption type aes256-cts-hmac-sha1-96 added to keytab FILE:/etc/krb5.keytab.
935s Entry for principal host/nfs-server.dep8 with kvno 2, encryption type aes128-cts-hmac-sha1-96 added to keytab FILE:/etc/krb5.keytab. 936s exporting *:/storage 938s mount.nfs: mount system call failed for /mnt 938s umount: /mnt: not mounted. 938s autopkgtest [14:51:02]: test kerberos-mount: ---] 939s kerberos-mount FAIL non-zero exit status 32 To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu-z-systems/+bug/2060217/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 2057734] Re: proc_sched_rt01 from ubuntu_ltp failed
Jammy will include them after the update to v5.15.150. ** Changed in: linux (Ubuntu Jammy) Status: New => Won't Fix -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/2057734 Title: proc_sched_rt01 from ubuntu_ltp failed Status in ubuntu-kernel-tests: New Status in linux package in Ubuntu: Invalid Status in linux source package in Focal: Confirmed Status in linux source package in Jammy: Confirmed Status in linux source package in Mantic: In Progress Bug description: This is a new test case, issue found on M/J/F/B when testing LTP update 20240312 Test log: INFO: Test start time: Tue Mar 12 11:52:21 UTC 2024 COMMAND:/opt/ltp/bin/ltp-pan -q -e -S -a 163430 -n 163430 -f /tmp/ltp-X3Nz2HWCQe/alltests -l /dev/null -C /dev/null -T /dev/null LOG File: /dev/null FAILED COMMAND File: /dev/null TCONF COMMAND File: /dev/null Running tests... tst_kconfig.c:87: TINFO: Parsing kernel config '/lib/modules/6.5.0-27-generic/build/.config' tst_test.c:1741: TINFO: LTP version: 20230929-406-gcbc2d0568 tst_test.c:1625: TINFO: Timeout per run is 0h 00m 30s proc_sched_rt01.c:45: TFAIL: Expect: timeslice_ms > 0 after reset to default proc_sched_rt01.c:51: TPASS: echo 0 > /proc/sys/kernel/sched_rt_period_us : EINVAL (22) proc_sched_rt01.c:53: TFAIL: echo -1 > /proc/sys/kernel/sched_rt_period_us invalid retval 2: SUCCESS (0) proc_sched_rt01.c:59: TPASS: echo -2 > /proc/sys/kernel/sched_rt_runtime_us : EINVAL (22) proc_sched_rt01.c:72: TFAIL: echo rt_period_us+1 > /proc/sys/kernel/sched_rt_runtime_us invalid retval 1: SUCCESS (0) HINT: You _MAY_ be missing kernel fixes: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c1fc6484e1fb https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=079be8fc6309 Summary: passed 2 failed 3 broken 0 skipped 0 warnings 0 INFO: ltp-pan reported some tests FAIL LTP Version: 20230929-406-gcbc2d0568 INFO: Test end
time: Tue Mar 12 11:52:21 UTC 2024 To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/2057734/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 2057734] Re: proc_sched_rt01 from ubuntu_ltp failed
Focal master-next already has those fix commits. ** Also affects: linux (Ubuntu Focal) Importance: Undecided Status: New ** Also affects: linux (Ubuntu Jammy) Importance: Undecided Status: New ** Changed in: linux (Ubuntu Focal) Status: New => Won't Fix -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/2057734 Title: proc_sched_rt01 from ubuntu_ltp failed Status in ubuntu-kernel-tests: New Status in linux package in Ubuntu: Invalid Status in linux source package in Focal: Confirmed Status in linux source package in Jammy: Confirmed Status in linux source package in Mantic: In Progress Bug description: This is a new test case, issue found on M/J/F/B when testing LTP update 20240312 Test log: INFO: Test start time: Tue Mar 12 11:52:21 UTC 2024 COMMAND:/opt/ltp/bin/ltp-pan -q -e -S -a 163430 -n 163430 -f /tmp/ltp-X3Nz2HWCQe/alltests -l /dev/null -C /dev/null -T /dev/null LOG File: /dev/null FAILED COMMAND File: /dev/null TCONF COMMAND File: /dev/null Running tests...
tst_kconfig.c:87: TINFO: Parsing kernel config '/lib/modules/6.5.0-27-generic/build/.config' tst_test.c:1741: TINFO: LTP version: 20230929-406-gcbc2d0568 tst_test.c:1625: TINFO: Timeout per run is 0h 00m 30s proc_sched_rt01.c:45: TFAIL: Expect: timeslice_ms > 0 after reset to default proc_sched_rt01.c:51: TPASS: echo 0 > /proc/sys/kernel/sched_rt_period_us : EINVAL (22) proc_sched_rt01.c:53: TFAIL: echo -1 > /proc/sys/kernel/sched_rt_period_us invalid retval 2: SUCCESS (0) proc_sched_rt01.c:59: TPASS: echo -2 > /proc/sys/kernel/sched_rt_runtime_us : EINVAL (22) proc_sched_rt01.c:72: TFAIL: echo rt_period_us+1 > /proc/sys/kernel/sched_rt_runtime_us invalid retval 1: SUCCESS (0) HINT: You _MAY_ be missing kernel fixes: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c1fc6484e1fb https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=079be8fc6309 Summary: passed 2 failed 3 broken 0 skipped 0 warnings 0 INFO: ltp-pan reported some tests FAIL LTP Version: 20230929-406-gcbc2d0568 INFO: Test end time: Tue Mar 12 11:52:21 UTC 2024 To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/2057734/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 2057734] Re: proc_sched_rt01 from ubuntu_ltp failed
** Also affects: linux (Ubuntu Mantic) Importance: Undecided Status: New ** Changed in: linux (Ubuntu Mantic) Importance: Undecided => Medium ** Changed in: linux (Ubuntu Mantic) Status: New => In Progress ** Changed in: linux (Ubuntu Mantic) Assignee: (unassigned) => GuoqingJiang (guoqingjiang) ** Changed in: linux (Ubuntu) Status: New => Invalid ** Changed in: linux (Ubuntu) Milestone: mantic-updates => noble-updates -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/2057734 Title: proc_sched_rt01 from ubuntu_ltp failed Status in ubuntu-kernel-tests: New Status in linux package in Ubuntu: Invalid Status in linux source package in Focal: New Status in linux source package in Jammy: New Status in linux source package in Mantic: In Progress Bug description: This is a new test case, issue found on M/J/F/B when testing LTP update 20240312 Test log: INFO: Test start time: Tue Mar 12 11:52:21 UTC 2024 COMMAND:/opt/ltp/bin/ltp-pan -q -e -S -a 163430 -n 163430 -f /tmp/ltp-X3Nz2HWCQe/alltests -l /dev/null -C /dev/null -T /dev/null LOG File: /dev/null FAILED COMMAND File: /dev/null TCONF COMMAND File: /dev/null Running tests... 
tst_kconfig.c:87: TINFO: Parsing kernel config '/lib/modules/6.5.0-27-generic/build/.config' tst_test.c:1741: TINFO: LTP version: 20230929-406-gcbc2d0568 tst_test.c:1625: TINFO: Timeout per run is 0h 00m 30s proc_sched_rt01.c:45: TFAIL: Expect: timeslice_ms > 0 after reset to default proc_sched_rt01.c:51: TPASS: echo 0 > /proc/sys/kernel/sched_rt_period_us : EINVAL (22) proc_sched_rt01.c:53: TFAIL: echo -1 > /proc/sys/kernel/sched_rt_period_us invalid retval 2: SUCCESS (0) proc_sched_rt01.c:59: TPASS: echo -2 > /proc/sys/kernel/sched_rt_runtime_us : EINVAL (22) proc_sched_rt01.c:72: TFAIL: echo rt_period_us+1 > /proc/sys/kernel/sched_rt_runtime_us invalid retval 1: SUCCESS (0) HINT: You _MAY_ be missing kernel fixes: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c1fc6484e1fb https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=079be8fc6309 Summary: passed 2 failed 3 broken 0 skipped 0 warnings 0 INFO: ltp-pan reported some tests FAIL LTP Version: 20230929-406-gcbc2d0568 INFO: Test end time: Tue Mar 12 11:52:21 UTC 2024 To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/2057734/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 2057734] Re: proc_sched_rt01 from ubuntu_ltp failed
Given that ESM kernels only accept CVE patches, I will only send patches against Mantic. [Impact] The updated LTP added the proc_sched_rt01 test case, which can't pass because several commits are missing on the kernel side. Test log: INFO: Test start time: Tue Mar 12 11:52:21 UTC 2024 COMMAND: /opt/ltp/bin/ltp-pan -q -e -S -a 163430 -n 163430 -f /tmp/ltp-X3Nz2HWCQe/alltests -l /dev/null -C /dev/null -T /dev/null LOG File: /dev/null FAILED COMMAND File: /dev/null TCONF COMMAND File: /dev/null Running tests... tst_kconfig.c:87: TINFO: Parsing kernel config '/lib/modules/6.5.0-27-generic/build/.config' tst_test.c:1741: TINFO: LTP version: 20230929-406-gcbc2d0568 tst_test.c:1625: TINFO: Timeout per run is 0h 00m 30s proc_sched_rt01.c:45: TFAIL: Expect: timeslice_ms > 0 after reset to default proc_sched_rt01.c:51: TPASS: echo 0 > /proc/sys/kernel/sched_rt_period_us : EINVAL (22) proc_sched_rt01.c:53: TFAIL: echo -1 > /proc/sys/kernel/sched_rt_period_us invalid retval 2: SUCCESS (0) proc_sched_rt01.c:59: TPASS: echo -2 > /proc/sys/kernel/sched_rt_runtime_us : EINVAL (22) proc_sched_rt01.c:72: TFAIL: echo rt_period_us+1 > /proc/sys/kernel/sched_rt_runtime_us invalid retval 1: SUCCESS (0) HINT: You _MAY_ be missing kernel fixes: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c1fc6484e1fb https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=079be8fc6309 [Fix] There are 3 relevant commits from upstream. 1. 079be8fc6309 sched/rt: Disallow writing invalid values to sched_rt_period_us 2. c1fc6484e1fb sched/rt: sysctl_sched_rr_timeslice show default timeslice after reset 3. c7fcb99877f9 sched/rt: Fix sysctl_sched_rr_timeslice intial value Mantic: the 3rd is already in master-next. Jammy: stable v5.15.150 includes the three commits. Focal: master-next has included them after the update to v5.4.270. Bionic: all three commits are needed. [Test case] Run LTP update 20240312 and check the log of proc_sched_rt01.
[Regression potential] Low risk, since these changes have existed upstream for a while. Cyril Hrubis (2): sched/rt: sysctl_sched_rr_timeslice show default timeslice after reset sched/rt: Disallow writing invalid values to sched_rt_period_us kernel/sched/rt.c | 12 1 file changed, 8 insertions(+), 4 deletions(-) ** Also affects: linux (Ubuntu) Importance: Undecided Status: New ** Also affects: linux (Ubuntu Mantic) Importance: Undecided Status: New ** Changed in: linux (Ubuntu Mantic) Milestone: None => mantic-updates ** No longer affects: linux (Ubuntu Mantic) ** Changed in: linux (Ubuntu) Milestone: None => mantic-updates -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/2057734 Title: proc_sched_rt01 from ubuntu_ltp failed Status in ubuntu-kernel-tests: New Status in linux package in Ubuntu: New Bug description: This is a new test case, issue found on M/J/F/B when testing LTP update 20240312 Test log: INFO: Test start time: Tue Mar 12 11:52:21 UTC 2024 COMMAND:/opt/ltp/bin/ltp-pan -q -e -S -a 163430 -n 163430 -f /tmp/ltp-X3Nz2HWCQe/alltests -l /dev/null -C /dev/null -T /dev/null LOG File: /dev/null FAILED COMMAND File: /dev/null TCONF COMMAND File: /dev/null Running tests...
tst_kconfig.c:87: TINFO: Parsing kernel config '/lib/modules/6.5.0-27-generic/build/.config' tst_test.c:1741: TINFO: LTP version: 20230929-406-gcbc2d0568 tst_test.c:1625: TINFO: Timeout per run is 0h 00m 30s proc_sched_rt01.c:45: TFAIL: Expect: timeslice_ms > 0 after reset to default proc_sched_rt01.c:51: TPASS: echo 0 > /proc/sys/kernel/sched_rt_period_us : EINVAL (22) proc_sched_rt01.c:53: TFAIL: echo -1 > /proc/sys/kernel/sched_rt_period_us invalid retval 2: SUCCESS (0) proc_sched_rt01.c:59: TPASS: echo -2 > /proc/sys/kernel/sched_rt_runtime_us : EINVAL (22) proc_sched_rt01.c:72: TFAIL: echo rt_period_us+1 > /proc/sys/kernel/sched_rt_runtime_us invalid retval 1: SUCCESS (0) HINT: You _MAY_ be missing kernel fixes: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c1fc6484e1fb https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=079be8fc6309 Summary: passed 2 failed 3 broken 0 skipped 0 warnings 0 INFO: ltp-pan reported some tests FAIL LTP Version: 20230929-406-gcbc2d0568 INFO: Test end time: Tue Mar 12 11:52:21 UTC 2024 To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/2057734/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
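The TPASS/TFAIL lines in the test log above amount to a handful of range checks on the two RT sysctls. The sketch below is a plain Python model of those expectations, inferred from the log: the period must be positive, and the runtime must be at least -1 and no larger than the period. The function name `validate_rt_sysctls` is hypothetical, and this is an illustration of what the test expects, not the kernel's code.

```python
import errno


def validate_rt_sysctls(period_us: int, runtime_us: int) -> int:
    """Return 0 if the pair is acceptable, else errno.EINVAL.

    Hypothetical model of what proc_sched_rt01 expects; not kernel code.
    """
    # "echo 0" and "echo -1" into sched_rt_period_us must both be rejected.
    if period_us <= 0:
        return errno.EINVAL
    # "echo -2" into sched_rt_runtime_us must be rejected.
    # -1 stays valid: it is the documented "no RT throttling" value.
    if runtime_us < -1:
        return errno.EINVAL
    # "echo rt_period_us+1" into sched_rt_runtime_us must be rejected.
    if runtime_us != -1 and runtime_us > period_us:
        return errno.EINVAL
    return 0


if __name__ == "__main__":
    # Kernel defaults: period 1000000 us, runtime 950000 us.
    print(validate_rt_sysctls(1_000_000, 950_000))
    print(validate_rt_sysctls(-1, 950_000))
```

On a kernel missing the fixes, the second and fourth checks in the log (period -1, runtime period+1) are the ones reported as SUCCESS instead of EINVAL.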
[Kernel-packages] [Bug 2060217] Re: NFSv4 fails to mount in noble/s390x
So mount v4 returns 32 (as a process exit status this is the mount(8) "mount failure" code rather than an errno such as EPIPE) but v3 returns 0. ubuntu@nfs:~$ sudo strace -e mount mount localhost:/home /mnt/nfs_home -o vers=4 mount.nfs: mount system call failed for /mnt/nfs_home --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1830, si_uid=0, si_status=32, si_utime=0, si_stime=0} --- +++ exited with 32 +++ ubuntu@nfs:~$ sudo strace -e mount mount localhost:/home /mnt/nfs_home -o vers=3 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1850, si_uid=0, si_status=0, si_utime=0, si_stime=0} --- +++ exited with 0 +++ Could you tell me which kernel version was used for the test below? It seems it was run on 20240302. https://autopkgtest.ubuntu.com/results/autopkgtest-noble/noble/s390x/n/nfs-utils/20240302_020943_bf464@/log.gz I also tried the mainline v6.8 kernel, which has the same issue. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/2060217 Title: NFSv4 fails to mount in noble/s390x Status in Ubuntu on IBM z Systems: New Status in linux package in Ubuntu: New Status in nfs-utils package in Ubuntu: Triaged Bug description: https://autopkgtest.ubuntu.com/packages/n/nfs-utils/noble/s390x Looks like it has been failing for a long time already. Log: https://autopkgtest.ubuntu.com/results/autopkgtest-noble/noble/s390x/n/nfs-utils/20240404_145924_ef255@/log.gz 339s autopkgtest [14:41:04]: test local-server-client: [--- 340s Killed 340s autopkgtest [14:41:05]: test process requested reboot with marker boot1 364s autopkgtest-virt-ssh: WARNING: ssh connection failed. Retrying in 3 seconds...
372s FAIL: nfs_home not mounted 373s autopkgtest [14:41:38]: test local-server-client: ---] 373s local-server-client FAIL non-zero exit status 1 and 934s autopkgtest [14:50:59]: test kerberos-mount: [--- 935s Initializing database '/var/lib/krb5kdc/principal' for realm 'DEP8', 935s master key name 'K/M@DEP8' 935s Authenticating as principal root/admin@DEP8 with password. 935s Principal "nfs/nfs-server.dep8@DEP8" created. 935s Authenticating as principal root/admin@DEP8 with password. 935s Entry for principal nfs/nfs-server.dep8 with kvno 2, encryption type aes256-cts-hmac-sha1-96 added to keytab FILE:/etc/krb5.keytab. 935s Entry for principal nfs/nfs-server.dep8 with kvno 2, encryption type aes128-cts-hmac-sha1-96 added to keytab FILE:/etc/krb5.keytab. 935s Authenticating as principal root/admin@DEP8 with password. 935s Principal "host/nfs-server.dep8@DEP8" created. 935s Authenticating as principal root/admin@DEP8 with password. 935s Entry for principal host/nfs-server.dep8 with kvno 2, encryption type aes256-cts-hmac-sha1-96 added to keytab FILE:/etc/krb5.keytab. 935s Entry for principal host/nfs-server.dep8 with kvno 2, encryption type aes128-cts-hmac-sha1-96 added to keytab FILE:/etc/krb5.keytab. 936s exporting *:/storage 938s mount.nfs: mount system call failed for /mnt 938s umount: /mnt: not mounted. 938s autopkgtest [14:51:02]: test kerberos-mount: ---] 939s kerberos-mount FAIL non-zero exit status 32 To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu-z-systems/+bug/2060217/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
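The si_status=32 in the strace output above is the exit status of the mount.nfs child process, so it can be read against the return codes listed in the mount(8) man page, where the values are bits that may be ORed together. A minimal sketch for decoding such a status follows; the table name and `decode_mount_status` are hypothetical helpers, and the descriptions are paraphrased from mount(8).

```python
# Exit status bits as documented in mount(8); values can be ORed together.
MOUNT_EXIT_BITS = {
    1: "incorrect invocation or permissions",
    2: "system error (out of memory, cannot fork, no more loop devices)",
    4: "internal mount bug",
    8: "user interrupt",
    16: "problems writing or locking /etc/mtab",
    32: "mount failure",
    64: "some mount succeeded",
}


def decode_mount_status(status: int) -> list:
    """Translate a mount(8) exit status into human-readable reasons."""
    if status == 0:
        return ["success"]
    return [text for bit, text in MOUNT_EXIT_BITS.items() if status & bit]


if __name__ == "__main__":
    print(decode_mount_status(32))
```

Read this way, the vers=4 attempt reports a generic mount failure, which matches the "kerberos-mount FAIL non-zero exit status 32" line in the autopkgtest log; the underlying errno from the mount system call is not visible in this output.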
[Kernel-packages] [Bug 2060217] Re: NFSv4 fails to mount in noble/s390x
Hi Andreas, I'd like to take a look. Could you please let me know how to access your VM? I don't have s390 hardware. BTW, it seems to be fine for both amd64 and arm64. https://autopkgtest.ubuntu.com/packages/n/nfs-utils/noble/amd64 https://autopkgtest.ubuntu.com/packages/n/nfs-utils/noble/arm64 And from the test history of s390, nfs-utils/1:2.6.4-3ubuntu3 can pass. https://autopkgtest.ubuntu.com/results/autopkgtest-noble/noble/s390x/n/nfs-utils/20240302_020943_bf464@/log.gz But nfs-utils/1:2.6.4-3ubuntu4 failed both the local-server-client and kerberos-mount tests. https://autopkgtest.ubuntu.com/results/autopkgtest-noble/noble/s390x/n/nfs-utils/20240331_152208_be4f4@/log.gz I cloned the nfs-utils repo, but I can't find anything strange in the git log. commit 5caa7491375e1e81012dcf1565d2e73c30f6f085 (tag: import/1%2.6.4-3ubuntu4, origin/ubuntu/noble) Author: Steve Langasek Date: Sun Mar 31 08:10:14 2024 + 1:2.6.4-3ubuntu4 (patches unapplied) Imported using git-ubuntu import. commit 86e924b7a8f68924304db261455cfff593cd5516 (tag: import/1%2.6.4-3ubuntu3) Author: Steve Langasek Date: Thu Feb 29 09:30:58 2024 + 1:2.6.4-3ubuntu3 (patches unapplied) Imported using git-ubuntu import. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/2060217 Title: NFSv4 fails to mount in noble/s390x Status in linux package in Ubuntu: New Status in nfs-utils package in Ubuntu: Triaged Bug description: https://autopkgtest.ubuntu.com/packages/n/nfs-utils/noble/s390x Looks like it has been failing for a long time already. Log: https://autopkgtest.ubuntu.com/results/autopkgtest-noble/noble/s390x/n/nfs-utils/20240404_145924_ef255@/log.gz 339s autopkgtest [14:41:04]: test local-server-client: [--- 340s Killed 340s autopkgtest [14:41:05]: test process requested reboot with marker boot1 364s autopkgtest-virt-ssh: WARNING: ssh connection failed. Retrying in 3 seconds...
372s FAIL: nfs_home not mounted 373s autopkgtest [14:41:38]: test local-server-client: ---] 373s local-server-client FAIL non-zero exit status 1 and 934s autopkgtest [14:50:59]: test kerberos-mount: [--- 935s Initializing database '/var/lib/krb5kdc/principal' for realm 'DEP8', 935s master key name 'K/M@DEP8' 935s Authenticating as principal root/admin@DEP8 with password. 935s Principal "nfs/nfs-server.dep8@DEP8" created. 935s Authenticating as principal root/admin@DEP8 with password. 935s Entry for principal nfs/nfs-server.dep8 with kvno 2, encryption type aes256-cts-hmac-sha1-96 added to keytab FILE:/etc/krb5.keytab. 935s Entry for principal nfs/nfs-server.dep8 with kvno 2, encryption type aes128-cts-hmac-sha1-96 added to keytab FILE:/etc/krb5.keytab. 935s Authenticating as principal root/admin@DEP8 with password. 935s Principal "host/nfs-server.dep8@DEP8" created. 935s Authenticating as principal root/admin@DEP8 with password. 935s Entry for principal host/nfs-server.dep8 with kvno 2, encryption type aes256-cts-hmac-sha1-96 added to keytab FILE:/etc/krb5.keytab. 935s Entry for principal host/nfs-server.dep8 with kvno 2, encryption type aes128-cts-hmac-sha1-96 added to keytab FILE:/etc/krb5.keytab. 936s exporting *:/storage 938s mount.nfs: mount system call failed for /mnt 938s umount: /mnt: not mounted. 938s autopkgtest [14:51:02]: test kerberos-mount: ---] 939s kerberos-mount FAIL non-zero exit status 32 To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2060217/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 2058477] Re: [Ubuntu 22.04.4/linux-image-6.5.0-26-generic] Kernel output "UBSAN: array-index-out-of-bounds in /build/linux-hwe-6.5-34pCLi/linux-hwe-6.5-6.5.0/drivers/net/hyperv/
[Impact] The error message "UBSAN: array-index-out-of-bounds in drivers/net/hyperv/netvsc.c:1446:41" appears multiple times during boot in a Hyper-V environment. [Fix] Cleanly cherry-pick commit bb9b0e46b84 for Focal, Jammy and Mantic. [Test case] Check dmesg to see whether the error message "UBSAN: array-index-out-of-bounds" is present. [Regression Potential] DPDK processes netvsc packets, so the change might be incompatible with ancient DPDK versions, but modern DPDK already uses a flexible array member. ** Also affects: linux (Ubuntu Mantic) Importance: Undecided Status: New ** Also affects: linux (Ubuntu Focal) Importance: Undecided Status: New ** Also affects: linux (Ubuntu Jammy) Importance: Undecided Status: New -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/2058477 Title: [Ubuntu 22.04.4/linux-image-6.5.0-26-generic] Kernel output "UBSAN: array-index-out-of-bounds in /build/linux-hwe-6.5-34pCLi/linux-hwe-6.5-6.5.0/drivers/net/hyperv/netvsc.c:1445:41" multiple times, especially during boot. Status in linux package in Ubuntu: New Status in linux source package in Focal: New Status in linux source package in Jammy: New Status in linux source package in Mantic: New Bug description: Overview: A newly installed Ubuntu Server 22.04.4 on a Hyper-V virtual machine outputs error message "UBSAN: array-index-out-of-bounds in /build/linux-hwe-6.5-34pCLi/linux-hwe-6.5-6.5.0/drivers/net/hyperv/netvsc.c:1446:41" multiple times, especially during boot. Reproducing steps: 1. Download ubuntu-22.04.4-live-server-amd64.iso 2. Create a Hyper-V virtual machine. 3. Install Ubuntu 22.04.4 Server on the VM with the downloaded iso normally. 4. Boot the machine.
Additional Information: - Host machine: Windows 10 Pro 22H2 OS Build 19045.3758 - Hyper-V configuration version: 9.0 - The error message "UBSAN: array-index-out-of-bounds" is similar to https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2008157, but the drivers are different. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2058477/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
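The [Test case] step above (checking dmesg for the UBSAN message) can be scripted. The following is a minimal sketch, assuming the report header has the "UBSAN: <kind> in <path>:<line>:<col>" shape seen in this thread; `find_ubsan_reports` is a hypothetical helper that works on log text already captured from dmesg or kern.log, and it does not run dmesg itself.

```python
import re

# Matches headers such as:
#   UBSAN: array-index-out-of-bounds in drivers/net/hyperv/netvsc.c:1446:41
UBSAN_RE = re.compile(
    r"UBSAN: (?P<kind>[\w-]+) in (?P<path>\S+):(?P<line>\d+):(?P<col>\d+)"
)


def find_ubsan_reports(log_text: str) -> list:
    """Return (kind, path, line, column) tuples for each UBSAN header found."""
    return [
        (m.group("kind"), m.group("path"), int(m.group("line")), int(m.group("col")))
        for m in UBSAN_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    sample = (
        "[   22.989831] UBSAN: array-index-out-of-bounds in "
        "drivers/net/hyperv/netvsc.c:1446:41\n"
    )
    for report in find_ubsan_reports(sample):
        print(report)
```

An empty result on a kernel carrying the cherry-picked commit would be consistent with the fix, though it only shows that no report was logged during that boot.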
[Kernel-packages] [Bug 2050032] Re: mpt3sas causes kernel stack trace
Maybe the series (https://lore.kernel.org/all/20230806170604.16143-2-ja...@equiv.tech/) is needed for linux-hwe-6.5. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-hwe-6.5 in Ubuntu. https://bugs.launchpad.net/bugs/2050032 Title: mpt3sas causes kernel stack trace Status in linux-hwe-6.5 package in Ubuntu: Confirmed Bug description: [ 22.989826] [ 22.989831] UBSAN: array-index-out-of-bounds in /build/linux-hwe-6.5-q7NZ0T/linux-hwe-6.5-6.5.0/drivers/scsi/mpt3sas/mpt3sas_scsih.c:4667:12 [ 22.989838] index 1 is out of range for type 'MPI2_EVENT_SAS_TOPO_PHY_ENTRY [1]' [ 22.989843] CPU: 23 PID: 0 Comm: swapper/23 Not tainted 6.5.0-14-generic #14~22.04.1-Ubuntu [ 22.989850] Hardware name: Supermicro H8DG6/H8DGi/H8DG6/H8DGi, BIOS 2.0b 03/01/2012 [ 22.989854] Call Trace: [ 22.989858] [ 22.989862] dump_stack_lvl+0x48/0x70 [ 22.989877] dump_stack+0x10/0x20 [ 22.989883] __ubsan_handle_out_of_bounds+0xc6/0x110 [ 22.989895] _scsih_check_topo_delete_events+0x2dc/0x350 [mpt3sas] [ 22.989962] mpt3sas_scsih_event_callback+0x21f/0x630 [mpt3sas] [ 22.990022] _base_async_event.isra.0+0x73/0x190 [mpt3sas] [ 22.990078] _base_process_reply_queue+0x3a0/0x720 [mpt3sas] [ 22.990133] _base_interrupt+0x4e/0x70 [mpt3sas] [ 22.990188] __handle_irq_event_percpu+0x4f/0x1c0 [ 22.990197] handle_irq_event+0x39/0x80 [ 22.990202] handle_edge_irq+0x8c/0x250 [ 22.990208] __common_interrupt+0x56/0x110 [ 22.990217] common_interrupt+0x9f/0xb0 [ 22.990224] [ 22.990226] [ 22.990228] asm_common_interrupt+0x27/0x40 [ 22.990239] RIP: 0010:cpuidle_idle_call+0xa2/0x190 [ 22.990248] Code: 00 4c 89 e2 4c 89 ee 48 89 df e8 c9 98 c1 00 4c 89 ee 48 89 df 89 c2 e8 9c a7 ff ff 65 48 8b 04 25 80 28 03 00 f0 80 48 02 20 <9c> 58 0f 1f 40 00 f6 c4 02 0f 84 8b 00 00 00 48 8b 45 d8 65 48 2b [ 22.990253] RSP: 0018:b627443a7eb0 EFLAGS: 0202 [ 22.990259] RAX: 9edde8178000 RBX: a58e52e0 RCX: [ 22.990263] RDX: RSI: RDI: [ 22.990266] RBP: b627443a7ee0 R08: R09: [ 
22.990269] R10: R11: R12: 0001 [ 22.990272] R13: 9ed9ea75cc00 R14: R15: [ 22.990279] do_idle+0x82/0xf0 [ 22.990285] cpu_startup_entry+0x1d/0x20 [ 22.990290] start_secondary+0x129/0x160 [ 22.990300] secondary_startup_64_no_verify+0x17e/0x18b [ 22.990311] [ 22.99031 To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-hwe-6.5/+bug/2050032/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 2058477] Re: [Ubuntu 22.04.4/linux-image-6.5.0-26-generic] Kernel output "UBSAN: array-index-out-of-bounds in /build/linux-hwe-6.5-34pCLi/linux-hwe-6.5-6.5.0/drivers/net/hyperv/
I think it was fixed by upstream commit bb9b0e46b84c ("hv: hyperv.h: Replace one-element array with flexible-array member"), but I need to double-check. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/2058477 Title: [Ubuntu 22.04.4/linux-image-6.5.0-26-generic] Kernel output "UBSAN: array-index-out-of-bounds in /build/linux-hwe-6.5-34pCLi/linux-hwe-6.5-6.5.0/drivers/net/hyperv/netvsc.c:1445:41" multiple times, especially during boot. Status in linux package in Ubuntu: New Bug description: Overview: A newly installed Ubuntu Server 22.04.4 on a Hyper-V virtual machine outputs error message "UBSAN: array-index-out-of-bounds in /build/linux-hwe-6.5-34pCLi/linux-hwe-6.5-6.5.0/drivers/net/hyperv/netvsc.c:1446:41" multiple times, especially during boot. Reproducing steps: 1. Download ubuntu-22.04.4-live-server-amd64.iso 2. Create a Hyper-V virtual machine. 3. Install Ubuntu 22.04.4 Server on the VM with the downloaded iso normally. 4. Boot the machine. Additional Information: - Host machine: Windows 10 Pro 22H2 OS Build 19045.3758 - Hyper-V configuration version: 9.0 - The error message "UBSAN: array-index-out-of-bounds" is similar to https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2008157, but the drivers are different. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2058477/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp