[Kernel-packages] [Bug 2061091] Re: Freezing user space processes failed after 20.008 seconds (1 tasks refusing to freeze, wq_busy=0)

2024-04-29 Thread GuoqingJiang
Thanks for the update. Since upstream 6.8 works, could you bisect to find
which commit fixes the issue?

https://git-scm.com/docs/git-bisect
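
Since the search here is for the commit that fixes the problem rather than the one that introduced it, git bisect can be started with custom terms, e.g. `git bisect start --term-old=broken --term-new=fixed`. Underneath, it is a binary search over the history. A minimal Python sketch of that idea follows, assuming a linear history and a hypothetical `is_fixed` callback that stands in for "build, boot, and try to suspend"; the commit names are placeholders, not real kernel commits.

```python
from typing import Callable, Sequence


def first_fixed(commits: Sequence[str], is_fixed: Callable[[str], bool]) -> str:
    """Return the first commit for which is_fixed() is True.

    Assumes commits are ordered oldest to newest, the first one is known
    broken, the last one is known fixed, and the fix is never reverted.
    """
    lo, hi = 0, len(commits) - 1      # commits[lo] broken, commits[hi] fixed
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_fixed(commits[mid]):
            hi = mid                  # the fix is at mid or earlier
        else:
            lo = mid                  # the fix is after mid
    return commits[hi]


# Hypothetical history: c0 (broken) ... c9 (fixed); the fix lands in c6.
history = [f"c{i}" for i in range(10)]
fixing_commit = first_fixed(history, lambda c: int(c[1:]) >= 6)
print(fixing_commit)
```

Each step halves the remaining range, so even a few thousand commits between the two kernels should need only a dozen or so test boots.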

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-signed-hwe-6.5 in Ubuntu.
https://bugs.launchpad.net/bugs/2061091

Title:
  Freezing user space processes failed after 20.008 seconds (1 tasks
  refusing to freeze, wq_busy=0)

Status in linux-signed-hwe-6.5 package in Ubuntu:
  New

Bug description:
  Sometimes, when trying to shut down or suspend my laptop, it gets
  stuck on the console screen. If I was suspending, it eventually gives
  up and goes back to the X session. During a shutdown it hangs forever
  and the only solution seems to be to force a reboot with Magic-SysRq.

  The following appears in `kern.log`:

  ```
  Apr 12 08:59:54 xps9320 kernel: [173172.510341] Freezing user space processes failed after 20.008 seconds (1 tasks refusing to freeze, wq_busy=0):
  Apr 12 08:59:54 xps9320 kernel: [173172.515669] task:wireplumber state:D stack:0 pid:2408  ppid:2398   flags:0x0006
  Apr 12 08:59:54 xps9320 kernel: [173172.518923] Call Trace:
  Apr 12 08:59:54 xps9320 kernel: [173172.521755]  
  Apr 12 08:59:54 xps9320 kernel: [173172.524099]  __schedule+0x2cb/0x750
  Apr 12 08:59:54 xps9320 kernel: [173172.526333]  schedule+0x63/0x110
  Apr 12 08:59:54 xps9320 kernel: [173172.528585]  snd_power_ref_and_wait+0xe5/0x140 [snd]
  Apr 12 08:59:54 xps9320 kernel: [173172.530825]  ? __pfx_autoremove_wake_function+0x10/0x10
  Apr 12 08:59:54 xps9320 kernel: [173172.533103]  snd_ctl_elem_info+0x4f/0x1b0 [snd]
  Apr 12 08:59:54 xps9320 kernel: [173172.535354]  snd_ctl_elem_info_user+0x59/0xc0 [snd]
  Apr 12 08:59:54 xps9320 kernel: [173172.537598]  snd_ctl_ioctl+0x1d4/0x650 [snd]
  Apr 12 08:59:54 xps9320 kernel: [173172.539846]  ? __fget_light+0xa5/0x120
  Apr 12 08:59:54 xps9320 kernel: [173172.542082]  __x64_sys_ioctl+0xa0/0xf0
  Apr 12 08:59:54 xps9320 kernel: [173172.544334]  do_syscall_64+0x58/0x90
  Apr 12 08:59:54 xps9320 kernel: [173172.546566]  ? syscall_exit_to_user_mode+0x37/0x60
  Apr 12 08:59:54 xps9320 kernel: [173172.548838]  ? do_syscall_64+0x67/0x90
  Apr 12 08:59:54 xps9320 kernel: [173172.551085]  ? do_syscall_64+0x67/0x90
  Apr 12 08:59:54 xps9320 kernel: [173172.553342]  ? syscall_exit_to_user_mode+0x37/0x60
  Apr 12 08:59:54 xps9320 kernel: [173172.96]  ? do_syscall_64+0x67/0x90
  Apr 12 08:59:54 xps9320 kernel: [173172.557828]  ? common_interrupt+0x54/0xb0
  Apr 12 08:59:54 xps9320 kernel: [173172.560075]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
  Apr 12 08:59:54 xps9320 kernel: [173172.562321] RIP: 0033:0x70f3adb1a94f
  Apr 12 08:59:54 xps9320 kernel: [173172.564650] RSP: 002b:7ffef2072940 EFLAGS: 0246 ORIG_RAX: 0010
  Apr 12 08:59:54 xps9320 kernel: [173172.566925] RAX: ffda RBX: 7ffef20729b0 RCX: 70f3adb1a94f
  Apr 12 08:59:54 xps9320 kernel: [173172.569222] RDX: 7ffef20729b0 RSI: c1105511 RDI: 0022
  Apr 12 08:59:54 xps9320 kernel: [173172.571501] RBP: 7ffef2072b90 R08: 65107d2b6070 R09: 0004
  Apr 12 08:59:54 xps9320 kernel: [173172.573762] R10: f014 R11: 0246 R12: 65107d2ee100
  Apr 12 08:59:54 xps9320 kernel: [173172.576047] R13: 65107d1fb400 R14: 7ffef2072b30 R15: 7ffef2072ad0
  Apr 12 08:59:54 xps9320 kernel: [173172.578308]  
  ```

  
  Another point of interest is that, when in this situation:
   * `lsusb` hangs after printing out a few lines
   * Outgoing SSH connections hang unless I clear `SSH_AUTH_SOCK`. gpg-agent
     appears to be trying to check for smartcards.

  So I guess there is some sort of deadlock in the USB subsystem which
  is causing all the other problems?

  ProblemType: Bug
  DistroRelease: Ubuntu 22.04
  Package: linux-image-6.5.0-27-generic 6.5.0-27.28~22.04.1
  ProcVersionSignature: Ubuntu 6.5.0-27.28~22.04.1-generic 6.5.13
  Uname: Linux 6.5.0-27-generic x86_64
  ApportVersion: 2.20.11-0ubuntu82.5
  Architecture: amd64
  CasperMD5CheckResult: pass
  CurrentDesktop: ubuntu:GNOME
  Date: Fri Apr 12 09:42:52 2024
  InstallationDate: Installed on 2022-06-28 (653 days ago)
  InstallationMedia: Xubuntu 22.04 LTS "Jammy Jellyfish" - Release amd64 (20220419)
  SourcePackage: linux-signed-hwe-6.5
  UpgradeStatus: No upgrade log present (probably fresh install)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-signed-hwe-6.5/+bug/2061091/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 2042363] Re: AIX 7.3 NFS client frequently returns an EIO error to an application when reading or writing to a file that has been locked with fcntl() on a Ubuntu 20.04 NFSV4 ser

2024-04-25 Thread GuoqingJiang
Per comment #23, the IP addresses of the AIX 7.2 clients are:

9.20.120.127 name = adia.v6.hursley.ibm.com -- Primary
9.20.121.46 name = amberjack.v6.hursley.ibm.com -- Partner


I searched the trace again for these IPs. It looks like socket cc6f0db2 is
used between 9.20.120.127 and the NFS server, and recvfrom on it can also
return EAGAIN (-11):

duckseason kernel: [13254.724411] svc: socket cc6f0db2 sendto([8485f39d 72... ], 72) = 72 (addr 9.20.120.127, port=1022)
...
duckseason kernel: [13254.724734] svc: socket cc6f0db2(inet c831762e), busy=0
duckseason kernel: [13254.724759] svc: server 728e82a2, pool 0, transport cc6f0db2, inuse=2
duckseason kernel: [13254.724761] svc: tcp_recv cc6f0db2 data 1 conn 0 close 0
duckseason kernel: [13254.724765] svc: socket cc6f0db2 recvfrom(b6708704, 4) = 4
duckseason kernel: [13254.724766] svc: TCP record, 168 bytes
duckseason kernel: [13254.724769] svc: socket cc6f0db2 recvfrom(57dbced3, 4096) = 168
duckseason kernel: [13254.724771] svc: TCP final record (168 bytes)
duckseason kernel: [13254.724775] svc: svc_authenticate (1)
duckseason kernel: [13254.724779] svc: server ee62a401, pool 0, transport cc6f0db2, inuse=3
duckseason kernel: [13254.724780] svc: tcp_recv cc6f0db2 data 1 conn 0 close 0
duckseason kernel: [13254.724783] svc: socket cc6f0db2 recvfrom(b6708704, 4) = -11

The same holds for socket 3497acd5, which is used between
9.20.121.46 and the NFS server.

duckseason kernel: [13254.802249] svc: socket 3497acd5 sendto([86e5a045 72... ], 72) = 72 (addr 9.20.121.46, port=1020)
...
duckseason kernel: [13254.802533] svc: socket 3497acd5(inet 72c9551d), busy=0
duckseason kernel: [13254.802571] svc: server 728e82a2, pool 0, transport 3497acd5, inuse=2
duckseason kernel: [13254.802573] svc: tcp_recv 3497acd5 data 1 conn 0 close 0
duckseason kernel: [13254.802578] svc: socket 3497acd5 recvfrom(77f9cf7c, 4) = 4
duckseason kernel: [13254.802579] svc: TCP record, 164 bytes
duckseason kernel: [13254.802583] svc: socket 3497acd5 recvfrom(57dbced3, 4096) = 164
duckseason kernel: [13254.802585] svc: TCP final record (164 bytes)
duckseason kernel: [13254.802590] svc: svc_authenticate (1)
duckseason kernel: [13254.802596] svc: server ee62a401, pool 0, transport 3497acd5, inuse=3
duckseason kernel: [13254.802597] svc: tcp_recv 3497acd5 data 1 conn 0 close 0
duckseason kernel: [13254.802599] svc: socket 3497acd5 recvfrom(77f9cf7c, 4) = -11

But according to the bug description the AIX 7.2 client works with the
same server, so I am curious why the 7.2 client also gets EAGAIN, just
like the 7.3 client. What am I missing?
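
To check a full trace for this pattern, a small script can pull out every `recvfrom` that returned a negative value. The sketch below is only an illustration: it assumes each log entry has been rejoined onto a single line, and the regular expression and the `failed_recvfrom` helper are made up for this example rather than taken from any existing tool. The errno table is limited to the Linux value seen in the trace above (11 is EAGAIN).

```python
import re

# Linux errno values relevant here (assumption: the server is Linux/x86_64).
ERRNO_NAMES = {11: "EAGAIN"}

# Matches e.g. "svc: socket cc6f0db2 recvfrom(b6708704, 4) = -11"
RECV_RE = re.compile(
    r"svc: socket (?P<sock>[0-9a-f]+) recvfrom\([0-9a-f]+, \d+\) = (?P<ret>-?\d+)"
)


def failed_recvfrom(lines):
    """Return (socket, errno name) for each recvfrom that returned < 0."""
    failures = []
    for line in lines:
        m = RECV_RE.search(line)
        if m and int(m.group("ret")) < 0:
            code = -int(m.group("ret"))
            failures.append((m.group("sock"), ERRNO_NAMES.get(code, str(code))))
    return failures


sample = [
    "duckseason kernel: [13254.724765] svc: socket cc6f0db2 recvfrom(b6708704, 4) = 4",
    "duckseason kernel: [13254.724783] svc: socket cc6f0db2 recvfrom(b6708704, 4) = -11",
    "duckseason kernel: [13254.802599] svc: socket 3497acd5 recvfrom(77f9cf7c, 4) = -11",
]
failures = failed_recvfrom(sample)
print(failures)
```

Grouping the result by socket would show whether the 7.2 and 7.3 client connections hit EAGAIN at a similar rate.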

Some questions/suggestions:

1. Did the AIX 7.3 NFS client work with a previous kernel? If so, run "git bisect" to
find which commit caused the issue.
2. Is it possible to try the latest 5.4 stable kernel, as suggested in
comment #1? Please also try the latest upstream kernel (6.9-rc5 at this time).
3. Does increasing the lease time make a difference?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2042363

Title:
  AIX 7.3 NFS client frequently returns an EIO error to an application
  when reading or writing to a file that has been locked with fcntl() on
  a Ubuntu 20.04 NFSV4 server

Status in linux package in Ubuntu:
  New

Bug description:
  ---Problem Description---
  AIX 7.3 NFS client frequently returns an EIO error to an application when 
reading or writing to a file that has been locked with fcntl(). NFS server is 
Ubuntu 20.04.6 LTS, GNU/Linux 5.4.0-139-generic x86_64. The problem does not 
appear to affect other combinations of NFS client (including AIX 7.2) with this 
NFS server.

  The AIX team have indicated that the cause of the EIO is triggered by the NFS 
server returning a BAD_SEQID error which leads to the AIX NFS client 
incorrectly zeroing the stateid, which then leads to the NFS server returning a 
BAD_STATEID error and the NFS client then returns the EIO error. The AIX team 
would like to understand why the BAD_SEQID has been returned.
   
  ---uname output---
  Linux duckseason 5.4.0-156-generic #173-Ubuntu SMP Tue Jul 11 07:25:22 UTC 
2023 x86_64 x86_64 x86_64 GNU/Linux
   
  Machine Type = VMware ESXi Server 7.0 4 x Intel(R) Xeon(R) Gold 6348H CPU @ 
2.30GHz  

  ---Steps to Reproduce---
   We cannot offer a simple way to recreate the problem as it involves IBM MQ 
running on two primary machines (AIX) using the Ubuntu server for its HA NFSv4 
storage.

  However, we can provide any requested trace or dumps from any or all
  of the involved machines.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2042363/+subscriptions



[Kernel-packages] [Bug 2051232] Re: kernel: BUG: Bad page state in process kworker

2024-04-23 Thread GuoqingJiang
Could you try the latest upstream kernel, which converts dm-crypt's
tasklet to a BH workqueue? I suppose commit fb6ad4aec1d0 ("dm-crypt:
Convert from tasklet to BH workqueue") might resolve the issue.

And mantic master-next has disabled tasklets for dm-crypt.

https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/mantic/commit/?h=master-next=13104eddc76990dc3e4183cff050c9b6dc5e859e

I suppose hwe-6.5 will sync from mantic later, so please try with the
newer kernel.

BTW, could you share how to reproduce the issue? I can try from my side
in case the above commits don't help.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-hwe-6.5 in Ubuntu.
https://bugs.launchpad.net/bugs/2051232

Title:
  kernel: BUG: Bad page state in process kworker

Status in linux-hwe-6.5 package in Ubuntu:
  Confirmed

Bug description:
  Similar to the bug
  https://bugs.launchpad.net/ubuntu/+source/linux-hwe-6.5/+bug/2051123
  where traces were shown, we observed a "BUG" being reported on yet
  another machine of the same make / model (Asus RS720A-E11-RS24U using
  dual socket AMD EPYC Milan CPUs):

  
  ```
  [...]
  Jan 24 08:57:00 fra-az1-comp-24 kernel: BUG: Bad page state in process 
kworker/u257:18  pfn:5812dc
  Jan 24 08:57:00 fra-az1-comp-24 kernel: page:b0c63dd1 refcount:-1 
mapcount:0 mapping: index:0x0 pfn:0x5812dc
  Jan 24 08:57:00 fra-az1-comp-24 kernel: flags: 
0x17c000(node=0|zone=2|lastcpupid=0x1f)
  Jan 24 08:57:00 fra-az1-comp-24 kernel: page_type: 0x()
  Jan 24 08:57:00 fra-az1-comp-24 kernel: raw: 0017c000 
dead0100 dead0122 
  Jan 24 08:57:00 fra-az1-comp-24 kernel: raw:  
  
  Jan 24 08:57:00 fra-az1-comp-24 kernel: page dumped because: nonzero _refcount
  Jan 24 08:57:00 fra-az1-comp-24 kernel: Modules linked in: vxlan 
ip6_udp_tunnel udp_tunnel ebt_arp nft_meta_bridge xt_CT xt_mac xt_state 
xt_comment xt_physdev vhost_net vhost vhost_iotlb tap xt_CHECKSUM xt_MASQUERADE 
xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat 
nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink xfrm_user 
xfrm_algo nvme_fabrics 8021q garp mrp br_netfilter bridge stp llc bonding 
binfmt_misc tls nls_ascii ipmi_ssif intel_rapl_msr intel_rapl_common 
amd64_edac edac_mce_amd kvm_amd kvm irqbypass rapl wmi_bmof irdma 
ib_uverbs ib_core joydev input_leds ccp k10temp ptdma switchtec acpi_ipmi 
ipmi_si ipmi_devintf ipmi_msghandler mac_hid sch_fq_codel efi_pstore ip_tables 
x_tables autofs4 dm_crypt raid10 raid456 async_raid6_recov async_memcpy 
async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear 
hid_generic usbhid hid cdc_ether usbnet mii ast i2c_algo_bit drm_shmem_helper 
raid1 drm_kms_helper ice crct10dif_pclmul crc32_pclmul polyval_clmulni 
polyval_generic
  Jan 24 08:57:00 fra-az1-comp-24 kernel:  ghash_clmulni_intel aesni_intel 
crypto_simd cryptd ahci nvme gnss drm i40e libahci xhci_pci i2c_piix4 
xhci_pci_renesas nvme_core nvme_common wmi
  Jan 24 08:57:00 fra-az1-comp-24 kernel: CPU: 14 PID: 1094271 Comm: 
kworker/u257:18 Not tainted 6.5.0-14-generic #14~22.04.1-Ubuntu
  Jan 24 08:57:00 fra-az1-comp-24 kernel: Hardware name: To be filled by O.E.M. 
To be filled by O.E.M./KMPP-D32 Series, BIOS 1501 08/23/2023
  Jan 24 08:57:00 fra-az1-comp-24 kernel: Workqueue: kcryptd/252:12 
kcryptd_crypt [dm_crypt]
  Jan 24 08:57:00 fra-az1-comp-24 kernel: Call Trace:
  Jan 24 08:57:00 fra-az1-comp-24 kernel:  
  Jan 24 08:57:00 fra-az1-comp-24 kernel:  dump_stack_lvl+0x48/0x70
  Jan 24 08:57:00 fra-az1-comp-24 kernel:  dump_stack+0x10/0x20
  Jan 24 08:57:00 fra-az1-comp-24 kernel:  bad_page+0x76/0x120
  Jan 24 08:57:00 fra-az1-comp-24 kernel:  __rmqueue_pcplist+0x149/0x1d0
  Jan 24 08:57:00 fra-az1-comp-24 kernel:  ? srso_alias_return_thunk+0x5/0x7f
  Jan 24 08:57:00 fra-az1-comp-24 kernel:  rmqueue+0x37c/0xf10
  Jan 24 08:57:00 fra-az1-comp-24 kernel:  get_page_from_freelist+0x10b/0x4c0
  Jan 24 08:57:00 fra-az1-comp-24 kernel:  ? srso_alias_return_thunk+0x5/0x7f
  Jan 24 08:57:00 fra-az1-comp-24 kernel:  __alloc_pages+0x1e7/0x350
  Jan 24 08:57:00 fra-az1-comp-24 kernel:  alloc_pages+0x90/0x1a0
  Jan 24 08:57:00 fra-az1-comp-24 kernel:  crypt_page_alloc+0x2f/0x70 [dm_crypt]
  Jan 24 08:57:00 fra-az1-comp-24 kernel:  mempool_alloc+0x83/0x1c0
  Jan 24 08:57:00 fra-az1-comp-24 kernel:  ? srso_alias_return_thunk+0x5/0x7f
  Jan 24 08:57:00 fra-az1-comp-24 kernel:  crypt_alloc_buffer+0x11a/0x1f0 
[dm_crypt]
  Jan 24 08:57:00 fra-az1-comp-24 kernel:  
kcryptd_crypt_write_convert+0xa3/0x1d0 [dm_crypt]
  Jan 24 08:57:00 fra-az1-comp-24 kernel:  kcryptd_crypt+0x114/0x170 [dm_crypt]
  Jan 24 08:57:00 fra-az1-comp-24 kernel:  process_one_work+0x240/0x450
  Jan 24 08:57:00 fra-az1-comp-24 kernel:  worker_thread+0x50/0x3f0
  Jan 24 08:57:00 fra-az1-comp-24 kernel:  ? 

[Kernel-packages] [Bug 2051123] Re: Kernel traces leading to crash - refcount_t: underflow; use-after-free and refcount_t: saturated; leaking memory -- lib/refcount.c

2024-04-23 Thread GuoqingJiang
Err, comments #5 and #6 are meant for lp#2051232, sorry for the confusion!

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-hwe-6.5 in Ubuntu.
https://bugs.launchpad.net/bugs/2051123

Title:
  Kernel traces leading to crash - refcount_t: underflow; use-after-free
  and refcount_t: saturated; leaking memory  --  lib/refcount.c

Status in linux-hwe-6.5 package in Ubuntu:
  Confirmed

Bug description:
  A few hours after upgrading a machine serving as VM hypervisor running
  OpenStack Nova + libvirt from linux kernel 6.2.0-37-generic to
  6.5.0-14-generic we observed kernel traces and quick disintegration of
  the system and its various processes.

  While the TCP connection itself was accepted, we were unable to log in via 
SSH anymore or use the console.
  A hard reset was required to get the machine back up. We went back to the 
former HWE kernel version, 6.2.0-37-generic, and have not observed any issues 
since.


  
  Attached is all of the kernel log from bootup to the crash - this is where 
the issues started ...

  ```
  [...]
  Jan 23 11:36:13 fra-az1-comp-21 kernel: vxlan-304: fa:16:3e:7f:e2:6f migrated 
from 10.101.11.98 to 10.101.11.101
  Jan 23 11:41:05 fra-az1-comp-21 kernel: hrtimer: interrupt took 32482 ns
  Jan 23 12:02:41 fra-az1-comp-21 kernel: clocksource: timekeeping watchdog on 
CPU50: hpet wd-wd read-back delay of 245561ns
  Jan 23 12:02:41 fra-az1-comp-21 kernel: clocksource: wd-tsc-wd read-back 
delay of 245561ns, clock-skew test skipped!
  Jan 23 12:18:18 fra-az1-comp-21 kernel: perf: interrupt took too long (2509 > 
2500), lowering kernel.perf_event_max_sample_rate to 79500
  Jan 23 12:44:35 fra-az1-comp-21 kernel: clocksource: timekeeping watchdog on 
CPU127: hpet wd-wd read-back delay of 244863ns
  Jan 23 12:44:35 fra-az1-comp-21 kernel: clocksource: wd-tsc-wd read-back 
delay of 245352ns, clock-skew test skipped!
  Jan 23 13:08:56 fra-az1-comp-21 kernel: clocksource: timekeeping watchdog on 
CPU16: hpet wd-wd read-back delay of 243257ns
  Jan 23 13:08:56 fra-az1-comp-21 kernel: clocksource: wd-tsc-wd read-back 
delay of 247517ns, clock-skew test skipped!
  Jan 23 14:13:58 fra-az1-comp-21 kernel: clocksource: timekeeping watchdog on 
CPU52: hpet wd-wd read-back delay of 248076ns
  Jan 23 14:13:58 fra-az1-comp-21 kernel: clocksource: wd-tsc-wd read-back 
delay of 245142ns, clock-skew test skipped!
  Jan 23 14:31:18 fra-az1-comp-21 kernel: clocksource: timekeeping watchdog on 
CPU124: hpet wd-wd read-back delay of 245073ns
  Jan 23 14:31:18 fra-az1-comp-21 kernel: clocksource: wd-tsc-wd read-back 
delay of 231034ns, clock-skew test skipped!
  Jan 23 15:13:52 fra-az1-comp-21 kernel: clocksource: timekeeping watchdog on 
CPU24: hpet wd-wd read-back delay of 244863ns
  Jan 23 15:13:52 fra-az1-comp-21 kernel: clocksource: wd-tsc-wd read-back 
delay of 245701ns, clock-skew test skipped!
  Jan 23 15:35:18 fra-az1-comp-21 kernel: clocksource: timekeeping watchdog on 
CPU76: hpet wd-wd read-back delay of 245282ns
  Jan 23 15:35:18 fra-az1-comp-21 kernel: clocksource: wd-tsc-wd read-back 
delay of 245841ns, clock-skew test skipped!
  Jan 23 16:10:49 fra-az1-comp-21 kernel: clocksource: timekeeping watchdog on 
CPU27: hpet wd-wd read-back delay of 244653ns
  Jan 23 16:10:49 fra-az1-comp-21 kernel: clocksource: wd-tsc-wd read-back 
delay of 245980ns, clock-skew test skipped!
  Jan 23 16:12:49 fra-az1-comp-21 kernel: workqueue: drain_vmap_area_work 
hogged CPU for >1us 4 times, consider switching to WQ_UNBOUND
  Jan 23 16:20:56 fra-az1-comp-21 kernel: clocksource: timekeeping watchdog on 
CPU0: hpet wd-wd read-back delay of 242907ns
  Jan 23 16:20:56 fra-az1-comp-21 kernel: clocksource: wd-tsc-wd read-back 
delay of 247796ns, clock-skew test skipped!
  Jan 23 16:25:04 fra-az1-comp-21 kernel: [ cut here ]
  Jan 23 16:25:04 fra-az1-comp-21 kernel: refcount_t: underflow; use-after-free.
  Jan 23 16:25:04 fra-az1-comp-21 kernel: WARNING: CPU: 84 PID: 7072 at 
lib/refcount.c:28 refcount_warn_saturate+0xa3/0x150
  Jan 23 16:25:04 fra-az1-comp-21 kernel: Modules linked in: xt_multiport 
ebt_arp nft_meta_bridge xt_CT xt_mac xt_set xt_state ip_set_hash_net ip_set 
vhost_net vhost vhost_iotlb tap xt_policy xt_REDIRECT xt_nat xt_connmark 
xt_mark vxlan ip6_udp_tunnel udp_tunnel xt_comment xt_physdev veth xt_CHECKSUM 
xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat 
nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables 
nfnetlink xfrm_user xfrm_algo nvme_fabrics 8021q garp mrp br_netfilter bridge 
stp llc bonding binfmt_misc tls nls_ascii ipmi_ssif intel_rapl_msr 
intel_rapl_common amd64_edac edac_mce_amd kvm_amd kvm irqbypass rapl wmi_bmof 
irdma ib_uverbs ib_core joydev input_leds ccp k10temp ptdma switchtec acpi_ipmi 
ipmi_si ipmi_devintf ipmi_msghandler mac_hid sch_fq_codel efi_pstore ip_tables 
x_tables autofs4 dm_crypt raid10 raid456 async_raid6_recov async_memcpy 

[Kernel-packages] [Bug 2051123] Re: Kernel traces leading to crash - refcount_t: underflow; use-after-free and refcount_t: saturated; leaking memory -- lib/refcount.c

2024-04-23 Thread GuoqingJiang
Just noticed master-next has disabled tasklets for dm-crypt.

https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/mantic/commit/?h=master-next=13104eddc76990dc3e4183cff050c9b6dc5e859e

I suppose hwe-6.5 will sync from mantic later, so please try with the
newer kernel.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-hwe-6.5 in Ubuntu.
https://bugs.launchpad.net/bugs/2051123


[Kernel-packages] [Bug 2051123] Re: Kernel traces leading to crash - refcount_t: underflow; use-after-free and refcount_t: saturated; leaking memory -- lib/refcount.c

2024-04-22 Thread GuoqingJiang
Could you try the latest upstream kernel, which converts dm-crypt's
tasklet to a BH workqueue? I suppose commit fb6ad4aec1d0 ("dm-crypt:
Convert from tasklet to BH workqueue") might resolve the issue.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-hwe-6.5 in Ubuntu.
https://bugs.launchpad.net/bugs/2051123


[Kernel-packages] [Bug 2042363] Re: AIX 7.3 NFS client frequently returns an EIO error to an application when reading or writing to a file that has been locked with fcntl() on a Ubuntu 20.04 NFSV4 ser

2024-04-17 Thread GuoqingJiang
** Attachment added: "RENEW packets between 9.20.32.85 (server) and 
9.20.120.127 (7.2 client)"
   
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2042363/+attachment/5767206/+files/7.2nfs.png

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2042363

Title:
  AIX 7.3 NFS client frequently returns an EIO error to an application
  when reading or writing to a file that has been locked with fcntl() on
  a Ubuntu 20.04 NFSV4 server

Status in linux package in Ubuntu:
  New

Bug description:
  ---Problem Description---
  AIX 7.3 NFS client frequently returns an EIO error to an application when 
reading or writing to a file that has been locked with fcntl(). NFS server is 
Ubuntu 20.04.6 LTS, GNU/Linux 5.4.0-139-generic x86_64. The problem does not 
appear to affect other combinations of NFS client (including AIX 7.2) with this 
NFS server.

  The AIX team have indicated that the cause of the EIO is triggered by the NFS 
server returning a BAD_SEQID error which leads to the AIX NFS client 
incorrectly zeroing the stateid, which then leads to the NFS server returning a 
BAD_STATEID error and the NFS client then returns the EIO error. The AIX team 
would like to understand why the BAD_SEQID has been returned.
   
  ---uname output---
  Linux duckseason 5.4.0-156-generic #173-Ubuntu SMP Tue Jul 11 07:25:22 UTC 
2023 x86_64 x86_64 x86_64 GNU/Linux
   
  Machine Type = VMware ESXi Server 7.0 4 x Intel(R) Xeon(R) Gold 6348H CPU @ 
2.30GHz  

  ---Steps to Reproduce---
   We cannot offer a simple way to recreate the problem as it involves IBM MQ 
running on two primary machines (AIX) using the Ubuntu server for its HA NFSv4 
storage.

  However, we can provide any requested trace or dumps from any or all
  of the involved machines.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2042363/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 2042363] Re: AIX 7.3 NFS client frequently returns an EIO error to an application when reading or writing to a file that has been locked with fcntl() on a Ubuntu 20.04 NFSV4 ser

2024-04-17 Thread GuoqingJiang
** Attachment added: "packets for 9.20.32.85 (server) and 9.20.120.112 (7.3 
client)"
   
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2042363/+attachment/5767207/+files/7.3nfs.png



[Kernel-packages] [Bug 2042363] Re: AIX 7.3 NFS client frequently returns an EIO error to an application when reading or writing to a file that has been locked with fcntl() on a Ubuntu 20.04 NFSV4 ser

2024-04-17 Thread GuoqingJiang
Sorry, I can't tell which parts of the logs in the attachments
(#comment11, #comment12 and #comment13) belong to the connection from the
working 7.2 client and which to the non-working 7.3 client. All the
attachments contain "TCP recvfrom got EAGAIN", which should come from the
connection for 7.3.

$ grep "TCP recvfrom got EAGAIN" syslog_16042024_amaliada_primary_adamsongrunter_partner_both_aix73_part1.log -r | wc -l
213127
$ grep "TCP recvfrom got EAGAIN" syslog_16042024_amaliada_primary_adamsongrunter_partner_both_aix73_part2.log -r | wc -l
226005
$ grep "TCP recvfrom got EAGAIN" syslog_17042024_adia_primary_amberjack_partner_both_aix72.log -r | wc -l
20233


May I suggest collecting those logs in two separate files, one from 7.2 and 
another from 7.3, instead of mixing them together?

I'm not a network expert, but I see some NFS RENEW operation packets between
9.20.32.85 (server) and 9.20.120.127 (7.2 client) in
tcp_dump17_04_2024_09H_10M, but no such RENEW packets between 9.20.32.85
(server) and 9.20.120.112 (7.3 client) in tcpdump16_04_2024_14H_03M.
Given that NFSv4 is a stateful filesystem based on leases, if the client
does not send an operation to renew the lease, it is possible for the
server to return EAGAIN. Please check whether the 7.3 client behaves
differently from the 7.2 client regarding lease renewal.
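
To make the lease idea concrete, here is a toy model in Python. It is only a
sketch and not the nfsd implementation: the Lease class, its field names and
the 90-second lease time are all assumptions chosen for illustration. What the
server actually returns once a lease has lapsed is not modelled here.

```python
from dataclasses import dataclass


@dataclass
class Lease:
    """Toy model of a server-side client lease (hypothetical, not nfsd code)."""

    lease_time: float    # seconds the lease stays valid after a renewal
    last_renewed: float  # timestamp of the most recent renewal

    def renew(self, now: float) -> None:
        # An explicit RENEW (or any other lease-renewing operation) resets the clock.
        self.last_renewed = now

    def is_valid(self, now: float) -> bool:
        return now - self.last_renewed <= self.lease_time


# Assumed lease time of 90 seconds, purely for illustration.
renewing = Lease(lease_time=90.0, last_renewed=0.0)
silent = Lease(lease_time=90.0, last_renewed=0.0)

# One client renews every 60 seconds; the other never renews.
for now in (60.0, 120.0, 180.0):
    renewing.renew(now)

renewing_ok = renewing.is_valid(200.0)
silent_ok = silent.is_valid(200.0)
print(renewing_ok, silent_ok)
```

In this model the client that keeps renewing still holds a valid lease at
t=200, while the silent one does not, which is the kind of difference worth
looking for between the 7.2 and 7.3 captures.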



[Kernel-packages] [Bug 2053194] Re: latest kernel update breaks sata hotplug on z690

2024-04-17 Thread GuoqingJiang
Could someone run git bisect to find which commit caused the regression? 
My wild guess would be the commit which adds the Intel Alder Lake-P AHCI
controller to the low power chipsets list, or one of the others listed below.


$ git log --oneline Ubuntu-5.15.0-92.102..Ubuntu-5.15.0-94.104 | grep "ata: ahci"
c553eda3bac6 ata: ahci: Add Intel Alder Lake-P AHCI controller to low power chipsets list
0e2b3a2aa29d ata: ahci: Add Elkhart Lake AHCI controller
6fd0f4242184 ata: ahci: Rename board_ahci_mobile
65ecbaa1fe47 ata: ahci: Add support for AMD A85 FCH (Hudson D4)
c21705b5ee4f ata: ahci: Drop pointless VPRINTK() calls and convert the remaining ones

$ git log --oneline Ubuntu-hwe-6.5-6.5.0-15.15_22.04.1..Ubuntu-hwe-6.5-6.5.0-17.17_22.04.1 | grep "ata: ahci"
df508bb822f9 ata: ahci: Add Intel Alder Lake-P AHCI controller to low power chipsets list
2a4dad1ecdf3 ata: ahci: Add Elkhart Lake AHCI controller
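
For anyone who has not used it, git bisect is a binary search over the commits
between a known-good and a known-bad version. The Python sketch below shows
only that search idea. The commit names c0..c15 and the "regression at c11"
predicate are made-up placeholders and say nothing about which real commit is
at fault.

```python
from typing import Callable, Sequence, Tuple


def first_bad_commit(
    commits: Sequence[str], is_bad: Callable[[str], bool]
) -> Tuple[str, int]:
    """Binary-search an oldest-to-newest commit list for the first bad commit.

    Assumes commits[0] is good, commits[-1] is bad, and that every commit
    after the first bad one is also bad.
    Returns the first bad commit and the number of commits that were tested.
    """
    good, bad = 0, len(commits) - 1
    tested = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tested += 1
        if is_bad(commits[mid]):
            bad = mid
        else:
            good = mid
    return commits[bad], tested


# Hypothetical history: 16 placeholder commits, regression introduced at c11.
history = [f"c{i}" for i in range(16)]
culprit, builds = first_bad_commit(history, lambda c: int(c[1:]) >= 11)
print(culprit, builds)
```

With 16 commits this needs only 4 test builds, so bisecting between the two
Ubuntu tags above should take far fewer boots than testing each commit in turn.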

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-signed-hwe-6.5 in Ubuntu.
https://bugs.launchpad.net/bugs/2053194

Title:
  latest kernel update breaks sata hotplug on z690

Status in linux-signed-hwe-6.5 package in Ubuntu:
  Confirmed

Bug description:
  SATA hotplug is not working on these kernels:
  linux-image-6.5.0-17-generic
  linux-image-5.15.0-94-generic

  SATA hotplug is working on these kernels:
  linux-image-6.5.0-15-generic
  linux-image-5.15.0-92-generic

  Affected platform:
  Gigabyte Z690 AORUS PRO
  BIOS Version: F28

  Note that I can only repro this on Z690. I also tried Z390 and Z370
  platforms and SATA hotplug is working there.

  Steps to repro:
  1. Enable SATA hotplug in BIOS
  2. Boot to affected kernel
  3. Hotplug a SATA drive while monitoring kernel messages

  ProblemType: Bug
  DistroRelease: Ubuntu 22.04
  Package: linux-image-6.5.0-17-generic 6.5.0-17.17~22.04.1
  ProcVersionSignature: Ubuntu 6.5.0-17.17~22.04.1-generic 6.5.8
  Uname: Linux 6.5.0-17-generic x86_64
  ApportVersion: 2.20.11-0ubuntu82.5
  Architecture: amd64
  CasperMD5CheckResult: pass
  CurrentDesktop: ubuntu:GNOME
  Date: Wed Feb 14 13:20:34 2024
  InstallationDate: Installed on 2024-01-26 (18 days ago)
  InstallationMedia: Ubuntu 22.04.3 LTS "Jammy Jellyfish" - Release amd64 
(20230807.2)
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  SourcePackage: linux-signed-hwe-6.5
  UpgradeStatus: No upgrade log present (probably fresh install)
  modified.conffile..etc.default.apport:
   # set this to 0 to disable apport, or to 1 to enable it
   # you can temporarily override this with
   # sudo service apport start force_start=1
   enabled=0
  mtime.conffile..etc.default.apport: 2024-01-26T15:01:36.495491

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-signed-hwe-6.5/+bug/2053194/+subscriptions




[Kernel-packages] [Bug 2061091] Re: Freezing user space processes failed after 20.008 seconds (1 tasks refusing to freeze, wq_busy=0)

2024-04-17 Thread GuoqingJiang
It looks like suspend/resume does complete sometimes.

Apr 15 12:49:26 xps9320 kernel: [10214.971688] Freezing user space processes 
completed (elapsed 0.002 seconds)
Apr 15 21:45:23 xps9320 kernel: [30355.137512] Freezing user space processes 
completed (elapsed 0.015 seconds)
Apr 15 23:02:59 xps9320 kernel: [47737.172847] Freezing user space processes 
failed after 20.006 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:03:19 xps9320 kernel: [47757.538206] Freezing user space processes 
failed after 20.001 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:03:49 xps9320 kernel: [47787.948230] Freezing user space processes 
failed after 20.010 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:04:10 xps9320 kernel: [47808.293913] Freezing user space processes 
failed after 20.004 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:04:40 xps9320 kernel: [47838.963650] Freezing user space processes 
failed after 20.011 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:05:01 xps9320 kernel: [47859.321047] Freezing user space processes 
failed after 20.007 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:05:31 xps9320 kernel: [47889.930764] Freezing user space processes 
failed after 20.007 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:05:52 xps9320 kernel: [47910.270362] Freezing user space processes 
failed after 20.008 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:06:22 xps9320 kernel: [47940.942995] Freezing user space processes 
failed after 20.004 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:06:43 xps9320 kernel: [47961.284299] Freezing user space processes 
failed after 20.004 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:07:13 xps9320 kernel: [47991.938673] Freezing user space processes 
failed after 20.003 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:07:34 xps9320 kernel: [48012.283148] Freezing user space processes 
failed after 20.004 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:07:54 xps9320 kernel: [48032.682674] Freezing user space processes 
failed after 20.001 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:08:15 xps9320 kernel: [48053.066382] Freezing user space processes 
failed after 20.001 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:08:37 xps9320 kernel: [48075.379470] Freezing user space processes 
failed after 20.006 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:08:57 xps9320 kernel: [48095.707188] Freezing user space processes 
failed after 20.001 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:09:18 xps9320 kernel: [48116.090956] Freezing user space processes 
failed after 20.007 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:09:38 xps9320 kernel: [48136.438119] Freezing user space processes 
failed after 20.006 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:09:59 xps9320 kernel: [48157.256494] Freezing user space processes 
failed after 20.002 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:10:19 xps9320 kernel: [48177.593855] Freezing user space processes 
failed after 20.006 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:10:41 xps9320 kernel: [48199.473829] Freezing user space processes 
failed after 20.004 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:11:01 xps9320 kernel: [48219.806060] Freezing user space processes 
failed after 20.006 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:11:22 xps9320 kernel: [48240.188072] Freezing user space processes 
failed after 20.011 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 15 23:11:42 xps9320 kernel: [48260.947835] Freezing user space processes 
failed after 20.005 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 16 12:52:21 xps9320 kernel: [13407.856663] Freezing user space processes 
completed (elapsed 0.002 seconds)
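
A quick way to tally such entries over a whole kern.log is sketched below in
Python. The function name and the regular expression are my own assumptions
based on the two message forms quoted above. The sample text is three entries
copied from the excerpt, including the line wrapping added by the mailer.

```python
import re

# Matches the two outcomes shown in the excerpt for the user-space freeze step.
FREEZE_RE = re.compile(r"Freezing user space processes (completed|failed)\b")


def summarize_freezes(log_text: str) -> dict:
    """Count completed and failed freeze attempts in kern.log text."""
    # Mail clients wrap long lines, so collapse all whitespace before matching.
    flat = " ".join(log_text.split())
    summary = {"completed": 0, "failed": 0}
    for match in FREEZE_RE.finditer(flat):
        summary[match.group(1)] += 1
    return summary


# Three entries copied from the excerpt above, wrapped as in the mail.
sample = """\
Apr 15 21:45:23 xps9320 kernel: [30355.137512] Freezing user space processes
completed (elapsed 0.015 seconds)
Apr 15 23:02:59 xps9320 kernel: [47737.172847] Freezing user space processes
failed after 20.006 seconds (1 tasks refusing to freeze, wq_busy=0):
Apr 16 12:52:21 xps9320 kernel: [13407.856663] Freezing user space processes
completed (elapsed 0.002 seconds)
"""

result = summarize_freezes(sample)
print(result)  # {'completed': 2, 'failed': 1}
```

Feeding it the full contents of kern.log instead of the sample would show how
often the freeze step fails relative to how often it completes.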

Could you rebuild the kernel with the CONFIG_PROVE_LOCKING option enabled to
detect locking-related deadlocks? Then upload the log after reproducing the
issue by shutting down the laptop with the new kernel. Also, please attach
the output of lsusb, since the problem could be USB-related.

It would also be helpful to try the latest noble kernel or a recent
upstream kernel, thanks.


[Kernel-packages] [Bug 2061091] Re: Freezing user space processes failed after 20.008 seconds (1 tasks refusing to freeze, wq_busy=0)

2024-04-16 Thread GuoqingJiang
This call trace looks related to the current issue. I guess
snd_power_ref_and_wait was waiting for snd_card_disconnect, which wakes
up power_sleep at the end of its code, but snd_card_disconnect ->
snd_device_disconnect_all -> snd_pcm_dev_disconnect was blocked for some
reason.

Apr 15 21:48:23 xps9320 kernel: [43261.751897] INFO: task kworker/0:6:34416 
blocked for more than 120 seconds.
Apr 15 21:48:23 xps9320 kernel: [43261.751900]   Tainted: G   OE
  6.5.0-27-generic #28~22.04.1-Ubuntu
Apr 15 21:48:23 xps9320 kernel: [43261.751901] "echo 0 > 
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 15 21:48:23 xps9320 kernel: [43261.751903] task:kworker/0:6 state:D 
stack:0 pid:34416 ppid:2  flags:0x4000
Apr 15 21:48:23 xps9320 kernel: [43261.751907] Workqueue: usb_hub_wq hub_event
Apr 15 21:48:23 xps9320 kernel: [43261.751913] Call Trace:
Apr 15 21:48:23 xps9320 kernel: [43261.751915]  
Apr 15 21:48:23 xps9320 kernel: [43261.751916]  __schedule+0x2cb/0x750
Apr 15 21:48:23 xps9320 kernel: [43261.751921]  schedule+0x63/0x110
Apr 15 21:48:23 xps9320 kernel: [43261.751924]  
schedule_preempt_disabled+0x15/0x30
Apr 15 21:48:23 xps9320 kernel: [43261.751928]  
rwsem_down_write_slowpath+0x2a2/0x550
Apr 15 21:48:23 xps9320 kernel: [43261.751940]  down_write+0x5c/0x80
Apr 15 21:48:23 xps9320 kernel: [43261.751945]  
snd_pcm_dev_disconnect+0x1d2/0x280 [snd_pcm]
Apr 15 21:48:23 xps9320 kernel: [43261.751964]  
snd_device_disconnect_all+0x47/0xa0 [snd]
Apr 15 21:48:23 xps9320 kernel: [43261.751979]  
snd_card_disconnect.part.0+0x10d/0x290 [snd]
Apr 15 21:48:23 xps9320 kernel: [43261.752019]  ? rpm_idle+0x25/0x2b0
Apr 15 21:48:23 xps9320 kernel: [43261.752023]  snd_card_disconnect+0x13/0x30 
[snd]
Apr 15 21:48:23 xps9320 kernel: [43261.752039]  
usb_audio_disconnect+0x114/0x2c0 [snd_usb_audio]
Apr 15 21:48:23 xps9320 kernel: [43261.752064]  usb_unbind_interface+0x8e/0x280
Apr 15 21:48:23 xps9320 kernel: [43261.752069]  device_remove+0x65/0x80
Apr 15 21:48:23 xps9320 kernel: [43261.752072]  
device_release_driver_internal+0x20b/0x270
Apr 15 21:48:23 xps9320 kernel: [43261.752077]  device_release_driver+0x12/0x20
Apr 15 21:48:23 xps9320 kernel: [43261.752080]  bus_remove_device+0xcb/0x140
Apr 15 21:48:23 xps9320 kernel: [43261.752083]  device_del+0x161/0x3e0
Apr 15 21:48:23 xps9320 kernel: [43261.752086]  ? kobject_put+0x67/0xa0
Apr 15 21:48:23 xps9320 kernel: [43261.752090]  usb_disable_device+0xd5/0x280
Apr 15 21:48:23 xps9320 kernel: [43261.752093]  usb_disconnect+0xe9/0x2e0
Apr 15 21:48:23 xps9320 kernel: [43261.752097]  usb_disconnect+0xcd/0x2e0
Apr 15 21:48:23 xps9320 kernel: [43261.752100]  usb_disconnect+0xcd/0x2e0
Apr 15 21:48:23 xps9320 kernel: [43261.752102]  ? usb_control_msg+0x106/0x160
Apr 15 21:48:23 xps9320 kernel: [43261.752105]  hub_port_connect+0x90/0xc30
Apr 15 21:48:23 xps9320 kernel: [43261.752109]  
hub_port_connect_change+0x91/0x300
Apr 15 21:48:23 xps9320 kernel: [43261.752113]  port_event+0x652/0x810
Apr 15 21:48:23 xps9320 kernel: [43261.752117]  hub_event+0x155/0x450
Apr 15 21:48:23 xps9320 kernel: [43261.752120]  process_one_work+0x23d/0x450
Apr 15 21:48:23 xps9320 kernel: [43261.752125]  worker_thread+0x50/0x3f0
Apr 15 21:48:23 xps9320 kernel: [43261.752128]  ? __pfx_worker_thread+0x10/0x10
Apr 15 21:48:23 xps9320 kernel: [43261.752131]  kthread+0xef/0x120
Apr 15 21:48:23 xps9320 kernel: [43261.752135]  ? __pfx_kthread+0x10/0x10
Apr 15 21:48:23 xps9320 kernel: [43261.752139]  ret_from_fork+0x44/0x70
Apr 15 21:48:23 xps9320 kernel: [43261.752143]  ? __pfx_kthread+0x10/0x10
Apr 15 21:48:23 xps9320 kernel: [43261.752147]  ret_from_fork_asm+0x1b/0x30
Apr 15 21:48:23 xps9320 kernel: [43261.752151]  


[Kernel-packages] [Bug 2061091] Re: Freezing user space processes failed after 20.008 seconds (1 tasks refusing to freeze, wq_busy=0)

2024-04-16 Thread GuoqingJiang
There is another call trace in kern.log which might be a different issue
(it probably needs a separate bug report).

Apr 15 21:45:23 xps9320 kernel: [43079.109011] 

Apr 15 21:45:23 xps9320 kernel: [43079.112281] UBSAN: shift-out-of-bounds in 
/build/linux-hwe-6.5-jkqeMi/linux-hwe-6.5-6.5.0/drivers/gpu/drm/display/drm_dp_mst_topology.c:4416:36
Apr 15 21:45:23 xps9320 kernel: [43079.115601] shift exponent -1 is negative
Apr 15 21:45:23 xps9320 kernel: [43079.119073] CPU: 0 PID: 34404 Comm: 
kworker/0:3 Tainted: G   OE  6.5.0-27-generic #28~22.04.1-Ubuntu
Apr 15 21:45:23 xps9320 kernel: [43079.122523] Hardware name: Dell Inc. XPS 
9320/0CW9KM, BIOS 2.8.0 11/13/2023
Apr 15 21:45:23 xps9320 kernel: [43079.126024] Workqueue: events 
output_poll_execute [drm_kms_helper]
Apr 15 21:45:23 xps9320 kernel: [43079.129530] Call Trace:
Apr 15 21:45:23 xps9320 kernel: [43079.132985]  
Apr 15 21:45:23 xps9320 kernel: [43079.136412]  dump_stack_lvl+0x48/0x70
Apr 15 21:45:23 xps9320 kernel: [43079.139805]  dump_stack+0x10/0x20
Apr 15 21:45:23 xps9320 kernel: [43079.143170]  
__ubsan_handle_shift_out_of_bounds+0x1ac/0x360
Apr 15 21:45:23 xps9320 kernel: [43079.146525]  
drm_dp_atomic_release_time_slots.cold+0x17/0x3d [drm_display_helper]
Apr 15 21:45:23 xps9320 kernel: [43079.149904]  
intel_dp_mst_atomic_check+0xaa/0x180 [i915]
Apr 15 21:45:23 xps9320 kernel: [43079.153378]  ? 
update_connector_routing+0x2fc/0x3f0 [drm_kms_helper]
Apr 15 21:45:23 xps9320 kernel: [43079.156738]  
drm_atomic_helper_check_modeset+0x300/0x610 [drm_kms_helper]
Apr 15 21:45:23 xps9320 kernel: [43079.160084]  intel_atomic_check+0xfe/0xb80 
[i915]
Apr 15 21:45:23 xps9320 kernel: [43079.163518]  ? 
drm_plane_check_pixel_format+0x53/0xe0 [drm]
Apr 15 21:45:23 xps9320 kernel: [43079.166858]  
drm_atomic_check_only+0x1ac/0x400 [drm]
Apr 15 21:45:23 xps9320 kernel: [43079.170172]  ? 
update_output_state+0x184/0x1a0 [drm]
Apr 15 21:45:23 xps9320 kernel: [43079.173478]  drm_atomic_commit+0x58/0xd0 
[drm]
Apr 15 21:45:23 xps9320 kernel: [43079.176765]  ? 
__pfx___drm_printfn_info+0x10/0x10 [drm]
Apr 15 21:45:23 xps9320 kernel: [43079.180049]  
drm_client_modeset_commit_atomic+0x203/0x240 [drm]
Apr 15 21:45:23 xps9320 kernel: [43079.183332]  
drm_client_modeset_commit_locked+0x5b/0x170 [drm]
Apr 15 21:45:23 xps9320 kernel: [43079.186586]  ? mutex_lock+0x12/0x50
Apr 15 21:45:23 xps9320 kernel: [43079.189776]  
drm_client_modeset_commit+0x26/0x50 [drm]
Apr 15 21:45:23 xps9320 kernel: [43079.192980]  
__drm_fb_helper_restore_fbdev_mode_unlocked+0xc2/0x100 [drm_kms_helper]
Apr 15 21:45:23 xps9320 kernel: [43079.196179]  
drm_fb_helper_hotplug_event+0x10b/0x120 [drm_kms_helper]
Apr 15 21:45:23 xps9320 kernel: [43079.199336]  
intel_fbdev_output_poll_changed+0x6b/0xb0 [i915]
Apr 15 21:45:23 xps9320 kernel: [43079.202568]  output_poll_execute+0x237/0x280 
[drm_kms_helper]
Apr 15 21:45:23 xps9320 kernel: [43079.205686]  ? __schedule+0x2d3/0x750
Apr 15 21:45:23 xps9320 kernel: [43079.208777]  process_one_work+0x23d/0x450
Apr 15 21:45:23 xps9320 kernel: [43079.211856]  worker_thread+0x50/0x3f0
Apr 15 21:45:23 xps9320 kernel: [43079.214921]  ? __pfx_worker_thread+0x10/0x10
Apr 15 21:45:23 xps9320 kernel: [43079.217984]  kthread+0xef/0x120
Apr 15 21:45:23 xps9320 kernel: [43079.221037]  ? __pfx_kthread+0x10/0x10
Apr 15 21:45:23 xps9320 kernel: [43079.224072]  ret_from_fork+0x44/0x70
Apr 15 21:45:23 xps9320 kernel: [43079.227100]  ? __pfx_kthread+0x10/0x10
Apr 15 21:45:23 xps9320 kernel: [43079.230123]  ret_from_fork_asm+0x1b/0x30
Apr 15 21:45:23 xps9320 kernel: [43079.233124]  
Apr 15 21:45:23 xps9320 kernel: [43079.235686] 



[Kernel-packages] [Bug 2061091] Re: Freezing user space processes failed after 20.008 seconds (1 tasks refusing to freeze, wq_busy=0)

2024-04-16 Thread GuoqingJiang
Could you share which audio driver is used on the affected system, via
"cat /proc/asound/modules"? And please attach the full kern.log so we can
check whether anything there looks suspicious.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-signed-hwe-6.5 in Ubuntu.
https://bugs.launchpad.net/bugs/2061091

Title:
  Freezing user space processes failed after 20.008 seconds (1 tasks
  refusing to freeze, wq_busy=0)

Status in linux-signed-hwe-6.5 package in Ubuntu:
  New

Bug description:
  Sometimes, when trying to shut down or suspend my laptop, it gets
  stuck on the console screen. If I was suspending, it eventually gives
  up and goes back to the X session. During a shutdown it hangs forever
  and the only solution seems to be to force a reboot with Magic-SysRq.

  The following appears in `kern.log`:

  ```
  Apr 12 08:59:54 xps9320 kernel: [173172.510341] Freezing user space processes 
failed after 20.008 seconds (1 tasks refusing to freeze, wq_busy=0):
  Apr 12 08:59:54 xps9320 kernel: [173172.515669] task:wireplumber state:D 
stack:0 pid:2408  ppid:2398   flags:0x0006
  Apr 12 08:59:54 xps9320 kernel: [173172.518923] Call Trace:
  Apr 12 08:59:54 xps9320 kernel: [173172.521755]  
  Apr 12 08:59:54 xps9320 kernel: [173172.524099]  __schedule+0x2cb/0x750
  Apr 12 08:59:54 xps9320 kernel: [173172.526333]  schedule+0x63/0x110
  Apr 12 08:59:54 xps9320 kernel: [173172.528585]  
snd_power_ref_and_wait+0xe5/0x140 [snd]
  Apr 12 08:59:54 xps9320 kernel: [173172.530825]  ? 
__pfx_autoremove_wake_function+0x10/0x10
  Apr 12 08:59:54 xps9320 kernel: [173172.533103]  snd_ctl_elem_info+0x4f/0x1b0 
[snd]
  Apr 12 08:59:54 xps9320 kernel: [173172.535354]  
snd_ctl_elem_info_user+0x59/0xc0 [snd]
  Apr 12 08:59:54 xps9320 kernel: [173172.537598]  snd_ctl_ioctl+0x1d4/0x650 
[snd]
  Apr 12 08:59:54 xps9320 kernel: [173172.539846]  ? __fget_light+0xa5/0x120
  Apr 12 08:59:54 xps9320 kernel: [173172.542082]  __x64_sys_ioctl+0xa0/0xf0
  Apr 12 08:59:54 xps9320 kernel: [173172.544334]  do_syscall_64+0x58/0x90
  Apr 12 08:59:54 xps9320 kernel: [173172.546566]  ? 
syscall_exit_to_user_mode+0x37/0x60
  Apr 12 08:59:54 xps9320 kernel: [173172.548838]  ? do_syscall_64+0x67/0x90
  Apr 12 08:59:54 xps9320 kernel: [173172.551085]  ? do_syscall_64+0x67/0x90
  Apr 12 08:59:54 xps9320 kernel: [173172.553342]  ? 
syscall_exit_to_user_mode+0x37/0x60
  Apr 12 08:59:54 xps9320 kernel: [173172.96]  ? do_syscall_64+0x67/0x90
  Apr 12 08:59:54 xps9320 kernel: [173172.557828]  ? common_interrupt+0x54/0xb0
  Apr 12 08:59:54 xps9320 kernel: [173172.560075]  
entry_SYSCALL_64_after_hwframe+0x6e/0xd8
  Apr 12 08:59:54 xps9320 kernel: [173172.562321] RIP: 0033:0x70f3adb1a94f
  Apr 12 08:59:54 xps9320 kernel: [173172.564650] RSP: 002b:7ffef2072940 
EFLAGS: 0246 ORIG_RAX: 0010
  Apr 12 08:59:54 xps9320 kernel: [173172.566925] RAX: ffda RBX: 
7ffef20729b0 RCX: 70f3adb1a94f
  Apr 12 08:59:54 xps9320 kernel: [173172.569222] RDX: 7ffef20729b0 RSI: 
c1105511 RDI: 0022
  Apr 12 08:59:54 xps9320 kernel: [173172.571501] RBP: 7ffef2072b90 R08: 
65107d2b6070 R09: 0004
  Apr 12 08:59:54 xps9320 kernel: [173172.573762] R10: f014 R11: 
0246 R12: 65107d2ee100
  Apr 12 08:59:54 xps9320 kernel: [173172.576047] R13: 65107d1fb400 R14: 
7ffef2072b30 R15: 7ffef2072ad0
  Apr 12 08:59:54 xps9320 kernel: [173172.578308]  
  ```

  
  Another point of interest is that, when in this situation:
   * `lsusb` hangs after printing out a few lines
   * Outgoing SSH connections hang unless I clear SSH_AUTH_SOCK. gpg-agent 
appears to be trying to check for smartcards.

  So I guess there is some sort of deadlock in the USB subsystem which
  is causing all the other problems?

  ProblemType: Bug
  DistroRelease: Ubuntu 22.04
  Package: linux-image-6.5.0-27-generic 6.5.0-27.28~22.04.1
  ProcVersionSignature: Ubuntu 6.5.0-27.28~22.04.1-generic 6.5.13
  Uname: Linux 6.5.0-27-generic x86_64
  ApportVersion: 2.20.11-0ubuntu82.5
  Architecture: amd64
  CasperMD5CheckResult: pass
  CurrentDesktop: ubuntu:GNOME
  Date: Fri Apr 12 09:42:52 2024
  InstallationDate: Installed on 2022-06-28 (653 days ago)
  InstallationMedia: Xubuntu 22.04 LTS "Jammy Jellyfish" - Release amd64 
(20220419)
  SourcePackage: linux-signed-hwe-6.5
  UpgradeStatus: No upgrade log present (probably fresh install)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-signed-hwe-6.5/+bug/2061091/+subscriptions




[Kernel-packages] [Bug 2042363] Re: AIX 7.3 NFS client frequently returns an EIO error to an application when reading or writing to a file that has been locked with fcntl() on a Ubuntu 20.04 NFSV4 ser

2024-04-15 Thread GuoqingJiang
Per the excerpt below from the trace file,

Nov 30 11:13:40 duckseason kernel: [1291756.354728] nfsd_dispatch: vers 4 proc 1
Nov 30 11:13:40 duckseason kernel: [1291756.354731] svc: server 
7c7e7536, pool 0, transport 3fd86d34, inuse=3
Nov 30 11:13:40 duckseason kernel: [1291756.354732] 
process_renew(6554b87b/4ab45507): starting
Nov 30 11:13:40 duckseason kernel: [1291756.354734] svc: tcp_recv 
3fd86d34 data 1 conn 0 close 0
Nov 30 11:13:40 duckseason kernel: [1291756.354736] svc: socket 
3fd86d34 recvfrom(03fecffb, 4) = -11
Nov 30 11:13:40 duckseason kernel: [1291756.354737] RPC: TCP recv_record got -11
Nov 30 11:13:40 duckseason kernel: [1291756.354737] RPC: TCP recvfrom got EAGAIN

we can see that the NFS server returns -11 (EAGAIN), which can be reached
via the following path:

svc_recv -> svc_handle_xprt
-> xprt->xpt_ops->xpo_recvfrom
   svc_tcp_recvfrom
   -> svc_recvfrom
      -> sock_recvmsg, which probably triggers sock_recvmsg_nosec -> ... -> tcp_recvmsg

As mentioned in the recvfrom man page:

ERRORS
   The recvfrom() function shall fail if:
   EAGAIN or EWOULDBLOCK
  The socket's file descriptor is marked O_NONBLOCK and no data is
  waiting  to  be  received;  or MSG_OOB is set and no out-of-band
  data is available and either the  socket's  file  descriptor  is
  marked  O_NONBLOCK  or  the  socket does not support blocking to
  await out-of-band data.
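
The first case described there is easy to reproduce from user space. The
Python sketch below is an illustration only and not the kernel code path
above: a local socketpair stands in for the client/server connection and the
helper name recv_nonblocking is made up. On Linux, errno 11 is EAGAIN, which
matches the -11 seen in the trace.

```python
import errno
import socket


def recv_nonblocking(sock: socket.socket, nbytes: int = 4):
    """Try to read from a non-blocking socket.

    Returns the bytes read, or the errno value when no data is waiting.
    """
    try:
        return sock.recv(nbytes)
    except BlockingIOError as exc:
        # Python raises BlockingIOError for EAGAIN/EWOULDBLOCK.
        return exc.errno


# A connected pair of local sockets stands in for client and server.
reader, writer = socket.socketpair()
reader.setblocking(False)

# Nothing has been written yet, so the read fails instead of blocking.
empty_result = recv_nonblocking(reader)

# Once data is queued, the same call succeeds.
writer.sendall(b"ping")
data_result = recv_nonblocking(reader)

reader.close()
writer.close()

print("no data   ->", errno.errorcode.get(empty_result, empty_result))
print("with data ->", data_result)
```

So an EAGAIN on a non-blocking read by itself only says that nothing was
queued on the socket at that moment.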

I am not sure whether the 7.3 NFS client opened a non-blocking socket and
there was no data to be read on that socket.
So I would like to check whether the 7.3 client sent something different
compared with the 7.2 client, which caused the server to return BAD_SEQID to
the AIX 7.3 client.

Please also collect the relevant trace log from the server side when
connecting with the 7.2 client, so that we can investigate the difference
between the good case and the bad one.

If possible, maybe you can also try the latest 5.4 stable (5.4.274) and
the upstream version (6.9-rc4).

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2042363

Title:
  AIX 7.3 NFS client frequently returns an EIO error to an application
  when reading or writing to a file that has been locked with fcntl() on
  a Ubuntu 20.04 NFSV4 server

Status in linux package in Ubuntu:
  New

Bug description:
  ---Problem Description---
  AIX 7.3 NFS client frequently returns an EIO error to an application when 
reading or writing to a file that has been locked with fcntl(). NFS server is 
Ubuntu 20.04.6 LTS, GNU/Linux 5.4.0-139-generic x86_64. The problem does not 
appear to affect other combinations of NFS client (including AIX 7.2) with this 
NFS server.

  The AIX team have indicated that the cause of the EIO is triggered by the NFS 
server returning a BAD_SEQID error which leads to the AIX NFS client 
incorrectly zeroing the stateid, which then leads to the NFS server returning a 
BAD_STATEID error and the NFS client then returns the EIO error. The AIX team 
would like to understand why the BAD_SEQID has been returned.
   
  ---uname output---
  Linux duckseason 5.4.0-156-generic #173-Ubuntu SMP Tue Jul 11 07:25:22 UTC 
2023 x86_64 x86_64 x86_64 GNU/Linux
   
  Machine Type = VMware ESXi Server 7.0 4 x Intel(R) Xeon(R) Gold 6348H CPU @ 
2.30GHz  

  ---Steps to Reproduce---
   We cannot offer a simple way to recreate the problem as it involves IBM MQ
running on two primary machines (AIX) using the Ubuntu server for its HA NFSv4
storage.

  However, we can provide any requested trace or dumps from any or all
  of the involved machines.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2042363/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 2058750] Re: i915 GPU HANG: ecode 12:1:db96edba

2024-04-14 Thread GuoqingJiang
The log from comment #3 looks similar to the one in this link:

https://www.reddit.com/r/archlinux/comments/14zifl8/gpu_hang_issue_on_laptop_had_to_hard_shutdown/


Could you try the latest firmware from upstream? The latest version is 70.20.0
per the log.

https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/log/i915/adlp_guc_70.bin

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-signed-hwe-6.5 in Ubuntu.
https://bugs.launchpad.net/bugs/2058750

Title:
  i915 GPU HANG: ecode 12:1:db96edba

Status in linux-signed-hwe-6.5 package in Ubuntu:
  New

Bug description:
  This has happened several times over a period of months when visiting
  the site https://www.windy.com and viewing satellite images. The
  computer completely freezes and must be hard restarted. I don't think
  I have triggered this freeze by visiting any other site.

  ProblemType: Bug
  DistroRelease: Ubuntu 22.04
  Package: linux-image-6.5.0-26-generic 6.5.0-26.26~22.04.1
  ProcVersionSignature: Ubuntu 6.5.0-26.26~22.04.1-generic 6.5.13
  Uname: Linux 6.5.0-26-generic x86_64
  ApportVersion: 2.20.11-0ubuntu82.5
  Architecture: amd64
  CasperMD5CheckResult: pass
  CurrentDesktop: ubuntu:GNOME
  Date: Fri Mar 22 14:14:41 2024
  InstallationDate: Installed on 2023-04-12 (344 days ago)
  InstallationMedia: Ubuntu 22.04.2 LTS "Jammy Jellyfish" - Release amd64 
(20230223)
  SourcePackage: linux-signed-hwe-6.5
  UpgradeStatus: No upgrade log present (probably fresh install)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-signed-hwe-6.5/+bug/2058750/+subscriptions


[Kernel-packages] [Bug 2060217] Re: NFSv4 fails to mount in noble/s390x

2024-04-11 Thread GuoqingJiang
Thanks Vasily. After it is merged by the upstream maintainer, maybe you can
send it to the Ubuntu kernel list as well, or wait until noble picks it up
from a future upstream stable release, since the patch has been Cc'ed to
sta...@vger.kernel.org and also has a Fixes tag.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2060217

Title:
  NFSv4 fails to mount in noble/s390x

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  New
Status in nfs-utils package in Ubuntu:
  Triaged

Bug description:
  https://autopkgtest.ubuntu.com/packages/n/nfs-utils/noble/s390x

  Looks like it has been failing for a long time already.

  Log: https://autopkgtest.ubuntu.com/results/autopkgtest-noble/noble/s390x/n/nfs-utils/20240404_145924_ef255@/log.gz

  339s autopkgtest [14:41:04]: test local-server-client: [---
  340s Killed
  340s autopkgtest [14:41:05]: test process requested reboot with marker boot1
  364s autopkgtest-virt-ssh: WARNING: ssh connection failed. Retrying in 3 seconds...
  372s FAIL: nfs_home not mounted
  373s autopkgtest [14:41:38]: test local-server-client: ---]
  373s local-server-client  FAIL non-zero exit status 1

  and

  934s autopkgtest [14:50:59]: test kerberos-mount: [---
  935s Initializing database '/var/lib/krb5kdc/principal' for realm 'DEP8',
  935s master key name 'K/M@DEP8'
  935s Authenticating as principal root/admin@DEP8 with password.
  935s Principal "nfs/nfs-server.dep8@DEP8" created.
  935s Authenticating as principal root/admin@DEP8 with password.
  935s Entry for principal nfs/nfs-server.dep8 with kvno 2, encryption type aes256-cts-hmac-sha1-96 added to keytab FILE:/etc/krb5.keytab.
  935s Entry for principal nfs/nfs-server.dep8 with kvno 2, encryption type aes128-cts-hmac-sha1-96 added to keytab FILE:/etc/krb5.keytab.
  935s Authenticating as principal root/admin@DEP8 with password.
  935s Principal "host/nfs-server.dep8@DEP8" created.
  935s Authenticating as principal root/admin@DEP8 with password.
  935s Entry for principal host/nfs-server.dep8 with kvno 2, encryption type aes256-cts-hmac-sha1-96 added to keytab FILE:/etc/krb5.keytab.
  935s Entry for principal host/nfs-server.dep8 with kvno 2, encryption type aes128-cts-hmac-sha1-96 added to keytab FILE:/etc/krb5.keytab.
  936s exporting *:/storage
  938s mount.nfs: mount system call failed for /mnt
  938s umount: /mnt: not mounted.
  938s autopkgtest [14:51:02]: test kerberos-mount: ---]
  939s kerberos-mount   FAIL non-zero exit status 32

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-z-systems/+bug/2060217/+subscriptions


[Kernel-packages] [Bug 2060217] Re: NFSv4 fails to mount in noble/s390x

2024-04-10 Thread GuoqingJiang
Thanks for the info. I tested v6.6, which was fine. I will dig into
the change.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2060217

Title:
  NFSv4 fails to mount in noble/s390x

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  New
Status in nfs-utils package in Ubuntu:
  Triaged


[Kernel-packages] [Bug 2057734] Re: proc_sched_rt01 from ubuntu_ltp failed

2024-04-10 Thread GuoqingJiang
Jammy will include them after the update to v5.15.150.

** Changed in: linux (Ubuntu Jammy)
   Status: New => Won't Fix

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2057734

Title:
  proc_sched_rt01 from ubuntu_ltp failed

Status in ubuntu-kernel-tests:
  New
Status in linux package in Ubuntu:
  Invalid
Status in linux source package in Focal:
  Confirmed
Status in linux source package in Jammy:
  Confirmed
Status in linux source package in Mantic:
  In Progress

Bug description:
  This is a new test case, issue found on M/J/F/B when testing LTP
  update 20240312

  Test log:
  INFO: Test start time: Tue Mar 12 11:52:21 UTC 2024
  COMMAND:/opt/ltp/bin/ltp-pan -q  -e -S   -a 163430 -n 163430  -f /tmp/ltp-X3Nz2HWCQe/alltests -l /dev/null  -C /dev/null -T /dev/null
  LOG File: /dev/null
  FAILED COMMAND File: /dev/null
  TCONF COMMAND File: /dev/null
  Running tests...
  tst_kconfig.c:87: TINFO: Parsing kernel config '/lib/modules/6.5.0-27-generic/build/.config'
  tst_test.c:1741: TINFO: LTP version: 20230929-406-gcbc2d0568
  tst_test.c:1625: TINFO: Timeout per run is 0h 00m 30s
  proc_sched_rt01.c:45: TFAIL: Expect: timeslice_ms > 0 after reset to default
  proc_sched_rt01.c:51: TPASS: echo 0 > /proc/sys/kernel/sched_rt_period_us : EINVAL (22)
  proc_sched_rt01.c:53: TFAIL: echo -1 > /proc/sys/kernel/sched_rt_period_us invalid retval 2: SUCCESS (0)
  proc_sched_rt01.c:59: TPASS: echo -2 > /proc/sys/kernel/sched_rt_runtime_us : EINVAL (22)
  proc_sched_rt01.c:72: TFAIL: echo rt_period_us+1 > /proc/sys/kernel/sched_rt_runtime_us invalid retval 1: SUCCESS (0)

  HINT: You _MAY_ be missing kernel fixes:

  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c1fc6484e1fb
  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=079be8fc6309

  Summary:
  passed   2
  failed   3
  broken   0
  skipped  0
  warnings 0
  INFO: ltp-pan reported some tests FAIL
  LTP Version: 20230929-406-gcbc2d0568
  INFO: Test end time: Tue Mar 12 11:52:21 UTC 2024

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/2057734/+subscriptions


[Kernel-packages] [Bug 2057734] Re: proc_sched_rt01 from ubuntu_ltp failed

2024-04-10 Thread GuoqingJiang
Focal master-next already has those fix commits.

** Also affects: linux (Ubuntu Focal)
   Importance: Undecided
   Status: New

** Also affects: linux (Ubuntu Jammy)
   Importance: Undecided
   Status: New

** Changed in: linux (Ubuntu Focal)
   Status: New => Won't Fix

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2057734

Title:
  proc_sched_rt01 from ubuntu_ltp failed

Status in ubuntu-kernel-tests:
  New
Status in linux package in Ubuntu:
  Invalid
Status in linux source package in Focal:
  Confirmed
Status in linux source package in Jammy:
  Confirmed
Status in linux source package in Mantic:
  In Progress


[Kernel-packages] [Bug 2057734] Re: proc_sched_rt01 from ubuntu_ltp failed

2024-04-10 Thread GuoqingJiang
** Also affects: linux (Ubuntu Mantic)
   Importance: Undecided
   Status: New

** Changed in: linux (Ubuntu Mantic)
   Importance: Undecided => Medium

** Changed in: linux (Ubuntu Mantic)
   Status: New => In Progress

** Changed in: linux (Ubuntu Mantic)
 Assignee: (unassigned) => GuoqingJiang (guoqingjiang)

** Changed in: linux (Ubuntu)
   Status: New => Invalid

** Changed in: linux (Ubuntu)
Milestone: mantic-updates => noble-updates

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2057734

Title:
  proc_sched_rt01 from ubuntu_ltp failed

Status in ubuntu-kernel-tests:
  New
Status in linux package in Ubuntu:
  Invalid
Status in linux source package in Focal:
  New
Status in linux source package in Jammy:
  New
Status in linux source package in Mantic:
  In Progress


[Kernel-packages] [Bug 2057734] Re: proc_sched_rt01 from ubuntu_ltp failed

2024-04-10 Thread GuoqingJiang
Given that ESM kernels only accept CVE patches, I will only send patches
against Mantic.

[Impact]
The updated LTP adds the proc_sched_rt01 test case, which cannot pass
because several commits are missing on the kernel side.

Test log:
INFO: Test start time: Tue Mar 12 11:52:21 UTC 2024
COMMAND: /opt/ltp/bin/ltp-pan -q -e -S -a 163430 -n 163430 -f /tmp/ltp-X3Nz2HWCQe/alltests -l /dev/null -C /dev/null -T /dev/null
LOG File: /dev/null
FAILED COMMAND File: /dev/null
TCONF COMMAND File: /dev/null
Running tests...
tst_kconfig.c:87: TINFO: Parsing kernel config '/lib/modules/6.5.0-27-generic/build/.config'
tst_test.c:1741: TINFO: LTP version: 20230929-406-gcbc2d0568
tst_test.c:1625: TINFO: Timeout per run is 0h 00m 30s
proc_sched_rt01.c:45: TFAIL: Expect: timeslice_ms > 0 after reset to default
proc_sched_rt01.c:51: TPASS: echo 0 > /proc/sys/kernel/sched_rt_period_us : EINVAL (22)
proc_sched_rt01.c:53: TFAIL: echo -1 > /proc/sys/kernel/sched_rt_period_us invalid retval 2: SUCCESS (0)
proc_sched_rt01.c:59: TPASS: echo -2 > /proc/sys/kernel/sched_rt_runtime_us : EINVAL (22)
proc_sched_rt01.c:72: TFAIL: echo rt_period_us+1 > /proc/sys/kernel/sched_rt_runtime_us invalid retval 1: SUCCESS (0)

HINT: You _MAY_ be missing kernel fixes:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c1fc6484e1fb
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=079be8fc6309

[Fix]
There are 3 relevant commits from upstream.

1. 079be8fc6309 sched/rt: Disallow writing invalid values to sched_rt_period_us
2. c1fc6484e1fb sched/rt: sysctl_sched_rr_timeslice show default timeslice after reset
3. c7fcb99877f9 sched/rt: Fix sysctl_sched_rr_timeslice intial value

Mantic: the 3rd is already in master-next.
Jammy: stable v5.15.150 includes the three commits.
Focal: master-next has included them since the update to v5.4.270.
Bionic: all three commits are needed.

[Test case]
Run LTP update 20240312 to check the log of proc_sched_rt01.
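
As a rough illustration of what the test log expects from the two sysctls,
the acceptance rules implied by the TPASS/TFAIL lines can be sketched as
below. This is not the kernel code from the listed commits, only a model of
the expected results; check_rt_sysctl is a hypothetical helper, and the
values 1000000 and 950000 are assumed to be the usual defaults of
sched_rt_period_us and sched_rt_runtime_us.

```python
import errno

# Assumed defaults for the two sysctls (microseconds).
DEFAULT_PERIOD_US = 1000000
DEFAULT_RUNTIME_US = 950000


def check_rt_sysctl(period_us, runtime_us):
    """Return 0 if the pair should be accepted, else errno.EINVAL.

    Models the expectations visible in the proc_sched_rt01 log:
    - sched_rt_period_us must be positive (0 and -1 are rejected),
    - sched_rt_runtime_us may be -1 but nothing smaller,
    - sched_rt_runtime_us must not exceed sched_rt_period_us.
    """
    if period_us <= 0:
        return errno.EINVAL
    if runtime_us < -1:
        return errno.EINVAL
    if runtime_us > period_us:
        return errno.EINVAL
    return 0


if __name__ == "__main__":
    cases = [
        (0, DEFAULT_RUNTIME_US),                      # echo 0 > period
        (-1, DEFAULT_RUNTIME_US),                     # echo -1 > period
        (DEFAULT_PERIOD_US, -2),                      # echo -2 > runtime
        (DEFAULT_PERIOD_US, DEFAULT_PERIOD_US + 1),   # echo period+1 > runtime
        (DEFAULT_PERIOD_US, DEFAULT_RUNTIME_US),      # defaults
    ]
    for period, runtime in cases:
        print(period, runtime, check_rt_sysctl(period, runtime))
```

The timeslice_ms check at proc_sched_rt01.c:45 is not covered by this sketch.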

[Regression potential]
Low risk, since these changes have been upstream for a while.

Cyril Hrubis (2):
  sched/rt: sysctl_sched_rr_timeslice show default timeslice after reset
  sched/rt: Disallow writing invalid values to sched_rt_period_us

 kernel/sched/rt.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

** Also affects: linux (Ubuntu)
   Importance: Undecided
   Status: New

** Also affects: linux (Ubuntu Mantic)
   Importance: Undecided
   Status: New

** Changed in: linux (Ubuntu Mantic)
Milestone: None => mantic-updates

** No longer affects: linux (Ubuntu Mantic)

** Changed in: linux (Ubuntu)
Milestone: None => mantic-updates

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2057734

Title:
  proc_sched_rt01 from ubuntu_ltp failed

Status in ubuntu-kernel-tests:
  New
Status in linux package in Ubuntu:
  New


[Kernel-packages] [Bug 2060217] Re: NFSv4 fails to mount in noble/s390x

2024-04-10 Thread GuoqingJiang
So the v4 mount exits with status 32 (per mount(8) this is the "mount
failure" exit code rather than EPIPE), while the v3 mount returns 0.

ubuntu@nfs:~$ sudo strace -e mount mount localhost:/home /mnt/nfs_home -o vers=4
mount.nfs: mount system call failed for /mnt/nfs_home
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1830, si_uid=0, si_status=32, si_utime=0, si_stime=0} ---
+++ exited with 32 +++

ubuntu@nfs:~$ sudo strace -e mount mount localhost:/home /mnt/nfs_home -o vers=3
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1850, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
+++ exited with 0 +++
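
A note for reading the si_status values above: mount(8) documents its exit
status as a bit mask, where 32 means "mount failure", which is distinct from
errno 32 (EPIPE). A small sketch of decoding it follows; the table is taken
from the EXIT STATUS section of mount(8), and describe_mount_status is a
hypothetical helper name.

```python
# Bit values as documented in the EXIT STATUS section of mount(8).
MOUNT_EXIT_BITS = {
    1: "incorrect invocation or permissions",
    2: "system error (out of memory, cannot fork, no more loop devices)",
    4: "internal mount bug",
    8: "user interrupt",
    16: "problems writing or locking /etc/mtab",
    32: "mount failure",
    64: "some mount succeeded",
}


def describe_mount_status(status):
    """Translate a mount(8) exit status into its documented meanings."""
    if status == 0:
        return ["success"]
    # The status is a bit mask, so several conditions can be ORed together.
    return [text for bit, text in MOUNT_EXIT_BITS.items() if status & bit]


if __name__ == "__main__":
    for status in (0, 32):
        print(status, describe_mount_status(status))
```

So the exit status only says that the mount failed; the underlying errno has
to come from the mount system call itself or from the kernel log.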

Could you tell me which kernel version was used for the test below? It seems
it was run on 20240302.
https://autopkgtest.ubuntu.com/results/autopkgtest-noble/noble/s390x/n/nfs-utils/20240302_020943_bf464@/log.gz

I also tried the mainline v6.8 kernel, which has the same issue.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2060217

Title:
  NFSv4 fails to mount in noble/s390x

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  New
Status in nfs-utils package in Ubuntu:
  Triaged


[Kernel-packages] [Bug 2060217] Re: NFSv4 fails to mount in noble/s390x

2024-04-08 Thread GuoqingJiang
Hi Andreas, I'd like to take a look. Could you please let me know how to
access your VM? I don't have s390 hardware.

BTW, it seems to be fine on both amd64 and arm64.

https://autopkgtest.ubuntu.com/packages/n/nfs-utils/noble/amd64
https://autopkgtest.ubuntu.com/packages/n/nfs-utils/noble/arm64

And from the s390 test history, nfs-utils/1:2.6.4-3ubuntu3 passes.

https://autopkgtest.ubuntu.com/results/autopkgtest-noble/noble/s390x/n/nfs-utils/20240302_020943_bf464@/log.gz

But nfs-utils/1:2.6.4-3ubuntu4 fails both the local-server-client and
kerberos-mount tests.

https://autopkgtest.ubuntu.com/results/autopkgtest-noble/noble/s390x/n/nfs-utils/20240331_152208_be4f4@/log.gz

I cloned the nfs-utils repo, but I can't find anything strange in the git
log.

commit 5caa7491375e1e81012dcf1565d2e73c30f6f085 (tag: import/1%2.6.4-3ubuntu4, origin/ubuntu/noble)
Author: Steve Langasek
Date:   Sun Mar 31 08:10:14 2024 +

    1:2.6.4-3ubuntu4 (patches unapplied)

    Imported using git-ubuntu import.

commit 86e924b7a8f68924304db261455cfff593cd5516 (tag: import/1%2.6.4-3ubuntu3)
Author: Steve Langasek
Date:   Thu Feb 29 09:30:58 2024 +

    1:2.6.4-3ubuntu3 (patches unapplied)

    Imported using git-ubuntu import.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2060217

Title:
  NFSv4 fails to mount in noble/s390x

Status in linux package in Ubuntu:
  New
Status in nfs-utils package in Ubuntu:
  Triaged


[Kernel-packages] [Bug 2058477] Re: [Ubuntu 22.04.4/linux-image-6.5.0-26-generic] Kernel output "UBSAN: array-index-out-of-bounds in /build/linux-hwe-6.5-34pCLi/linux-hwe-6.5-6.5.0/drivers/net/hyperv/

2024-04-01 Thread GuoqingJiang
[Impact]
The error message "UBSAN: array-index-out-of-bounds in
drivers/net/hyperv/netvsc.c:1446:41" appears multiple times during boot
in a Hyper-V environment.

[Fix]
Clean cherry-pick commit bb9b0e46b84 for Focal, Jammy and Mantic.

[Test case]
Check dmesg to see whether the error message "UBSAN:
array-index-out-of-bounds" still appears.

[Regression Potential]
DPDK processes netvsc packets, so the change might be incompatible with
ancient DPDK versions, but modern DPDK already uses a flexible array member.

** Also affects: linux (Ubuntu Mantic)
   Importance: Undecided
   Status: New

** Also affects: linux (Ubuntu Focal)
   Importance: Undecided
   Status: New

** Also affects: linux (Ubuntu Jammy)
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2058477

Title:
  [Ubuntu 22.04.4/linux-image-6.5.0-26-generic] Kernel output "UBSAN:
  array-index-out-of-bounds in /build/linux-hwe-6.5-34pCLi/linux-
  hwe-6.5-6.5.0/drivers/net/hyperv/netvsc.c:1445:41" multiple times,
  especially during boot.

Status in linux package in Ubuntu:
  New
Status in linux source package in Focal:
  New
Status in linux source package in Jammy:
  New
Status in linux source package in Mantic:
  New

Bug description:
  Overview:

  A newly installed Ubuntu Server 22.04.4 on a Hyper-V virtual machine
  outputs error message "UBSAN: array-index-out-of-bounds in
  /build/linux-hwe-6.5-34pCLi/linux-
  hwe-6.5-6.5.0/drivers/net/hyperv/netvsc.c:1446:41" multiple times,
  especially during boot.

  Reproducing steps:
  1. Download ubuntu-22.04.4-live-server-amd64.iso
  2. Create a Hyper-V virtual machine.
  3. Install Ubuntu 22.04.4 Server on the VM with the downloaded iso normally.
  4. Boot the machine.

  Additional Information:
  - Host machine: Windows 10 Pro 22H2 OS Build 19045.3758
  - Hyper-V configuration version: 9.0
  - The error message "UBSAN: array-index-out-of-bounds" is similar to
    https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2008157, but the
    drivers are different.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2058477/+subscriptions




[Kernel-packages] [Bug 2050032] Re: mpt3sas causes kernel stack trace

2024-03-31 Thread GuoqingJiang
Maybe the series
(https://lore.kernel.org/all/20230806170604.16143-2-ja...@equiv.tech/)
is needed for linux-hwe-6.5.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-hwe-6.5 in Ubuntu.
https://bugs.launchpad.net/bugs/2050032

Title:
  mpt3sas causes kernel stack trace

Status in linux-hwe-6.5 package in Ubuntu:
  Confirmed

Bug description:
  [   22.989826] 

  [   22.989831] UBSAN: array-index-out-of-bounds in /build/linux-hwe-6.5-q7NZ0T/linux-hwe-6.5-6.5.0/drivers/scsi/mpt3sas/mpt3sas_scsih.c:4667:12
  [   22.989838] index 1 is out of range for type 'MPI2_EVENT_SAS_TOPO_PHY_ENTRY [1]'
  [   22.989843] CPU: 23 PID: 0 Comm: swapper/23 Not tainted 6.5.0-14-generic #14~22.04.1-Ubuntu
  [   22.989850] Hardware name: Supermicro H8DG6/H8DGi/H8DG6/H8DGi, BIOS 2.0b   03/01/2012
  [   22.989854] Call Trace:
  [   22.989858]  
  [   22.989862]  dump_stack_lvl+0x48/0x70
  [   22.989877]  dump_stack+0x10/0x20
  [   22.989883]  __ubsan_handle_out_of_bounds+0xc6/0x110
  [   22.989895]  _scsih_check_topo_delete_events+0x2dc/0x350 [mpt3sas]
  [   22.989962]  mpt3sas_scsih_event_callback+0x21f/0x630 [mpt3sas]
  [   22.990022]  _base_async_event.isra.0+0x73/0x190 [mpt3sas]
  [   22.990078]  _base_process_reply_queue+0x3a0/0x720 [mpt3sas]
  [   22.990133]  _base_interrupt+0x4e/0x70 [mpt3sas]
  [   22.990188]  __handle_irq_event_percpu+0x4f/0x1c0
  [   22.990197]  handle_irq_event+0x39/0x80
  [   22.990202]  handle_edge_irq+0x8c/0x250
  [   22.990208]  __common_interrupt+0x56/0x110
  [   22.990217]  common_interrupt+0x9f/0xb0
  [   22.990224]  
  [   22.990226]  
  [   22.990228]  asm_common_interrupt+0x27/0x40
  [   22.990239] RIP: 0010:cpuidle_idle_call+0xa2/0x190
  [   22.990248] Code: 00 4c 89 e2 4c 89 ee 48 89 df e8 c9 98 c1 00 4c 89 ee 48 89 df 89 c2 e8 9c a7 ff ff 65 48 8b 04 25 80 28 03 00 f0 80 48 02 20 <9c> 58 0f 1f 40 00 f6 c4 02 0f 84 8b 00 00 00 48 8b 45 d8 65 48 2b
  [   22.990253] RSP: 0018:b627443a7eb0 EFLAGS: 0202
  [   22.990259] RAX: 9edde8178000 RBX: a58e52e0 RCX:
  [   22.990263] RDX:  RSI:  RDI:
  [   22.990266] RBP: b627443a7ee0 R08:  R09:
  [   22.990269] R10:  R11:  R12: 0001
  [   22.990272] R13: 9ed9ea75cc00 R14:  R15:
  [   22.990279]  do_idle+0x82/0xf0
  [   22.990285]  cpu_startup_entry+0x1d/0x20
  [   22.990290]  start_secondary+0x129/0x160
  [   22.990300]  secondary_startup_64_no_verify+0x17e/0x18b
  [   22.990311]  
  [   22.99031
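
The two report lines at the top of the trace above carry the key facts: which check fired, where, which index was used, and how long the array was declared to be. The sketch below pulls those fields out of a log. It is only an illustration; `summarize_ubsan` and the dictionary keys are hypothetical names, and it assumes each report line arrives unwrapped on a single line.

```python
import re

HEADER_RE = re.compile(r"UBSAN: (\S+) in (\S+):(\d+):(\d+)")
DETAIL_RE = re.compile(r"index (-?\d+) is out of range for type '([^']+)'")


def summarize_ubsan(log_text):
    """Extract check name, source location, index and declared array length."""
    header = HEADER_RE.search(log_text)
    detail = DETAIL_RE.search(log_text)
    if header is None or detail is None:
        return None
    # The type string ends in "[N]" when the object is a fixed-size array.
    declared = re.search(r"\[(\d+)\]$", detail.group(2))
    return {
        "check": header.group(1),
        "file": header.group(2),
        "line": int(header.group(3)),
        "column": int(header.group(4)),
        "index": int(detail.group(1)),
        "type": detail.group(2),
        "declared_len": int(declared.group(1)) if declared else None,
    }


# The two report lines from the trace above, joined onto single lines.
log = (
    "[   22.989831] UBSAN: array-index-out-of-bounds in "
    "/build/linux-hwe-6.5-q7NZ0T/linux-hwe-6.5-6.5.0/drivers/scsi/mpt3sas/mpt3sas_scsih.c:4667:12\n"
    "[   22.989838] index 1 is out of range for type 'MPI2_EVENT_SAS_TOPO_PHY_ENTRY [1]'\n"
)

print(summarize_ubsan(log))
```

A declared length of 1 combined with an index of 1 or more is the usual signature of a one-element array being used as a variable-length trailing array, which is the pattern that flexible-array conversions address.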

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-hwe-6.5/+bug/2050032/+subscriptions




[Kernel-packages] [Bug 2058477] Re: [Ubuntu 22.04.4/linux-image-6.5.0-26-generic] Kernel output "UBSAN: array-index-out-of-bounds in /build/linux-hwe-6.5-34pCLi/linux-hwe-6.5-6.5.0/drivers/net/hyperv/

2024-03-31 Thread GuoqingJiang
I think this was fixed by upstream commit bb9b0e46b84c ("hv: hyperv.h:
Replace one-element array with flexible-array member"), but I need to
double-check.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2058477

Title:
  [Ubuntu 22.04.4/linux-image-6.5.0-26-generic] Kernel output "UBSAN:
  array-index-out-of-bounds in /build/linux-hwe-6.5-34pCLi/linux-
  hwe-6.5-6.5.0/drivers/net/hyperv/netvsc.c:1445:41" multiple times,
  especially during boot.

Status in linux package in Ubuntu:
  New


To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2058477/+subscriptions

