This bug is awaiting verification that the linux-mtk/5.15.0-1030.34
kernel in -proposed solves the problem. Please test the kernel and
update this bug with the results. If the problem is solved, change the
tag 'verification-needed-jammy-linux-mtk' to 'verification-done-jammy-
linux-mtk'. If the problem still exists, change the tag 'verification-
needed-jammy-linux-mtk' to 'verification-failed-jammy-linux-mtk'.


If verification is not done by 5 working days from today, this fix will
be dropped from the source code, and this bug will be closed.


See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: kernel-spammed-jammy-linux-mtk-v2 
verification-needed-jammy-linux-mtk

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2032176

Title:
  Crashing with CPU soft lock on GA kernel 5.15.0.79.76 and HWE kernel
  5.19.0-46.47-22.04.1

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Jammy:
  Fix Released

Bug description:
  Impact:
  We had reports of VM setups that showed intermittent crashes and then
  locked up completely. This could be reproduced with large-memory setups.
  The problem seems to be that fixes for performance regressions caused
  further problems in 5.15 kernels, and the full fixes are too intrusive
  to be backported.

  Fix:
  The following patch was recently sent to the upstream stable mailing
  list and looks to be making its way into linux-5.15.y. The patch changes
  the default value of kvm.tdp_mmu to off (anyone willing to take the
  risks can change it back in config).
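
A minimal sketch for checking which setting a host is actually running with, assuming the kvm module exposes the parameter under sysfs at /sys/module/kvm/parameters/tdp_mmu as a Y/N boolean. The function name tdp_mmu_enabled is made up for illustration and is not part of the patch.

```python
from pathlib import Path

# Assumed sysfs location of the kvm module's tdp_mmu parameter.
TDP_MMU_PARAM = Path("/sys/module/kvm/parameters/tdp_mmu")


def tdp_mmu_enabled(param_path=TDP_MMU_PARAM):
    """Return True or False for the current setting, or None if unavailable."""
    try:
        value = Path(param_path).read_text().strip()
    except OSError:
        # kvm module not loaded, or the parameter does not exist on this kernel.
        return None
    # Boolean module parameters are normally reported as "Y" or "N".
    return value in ("Y", "y", "1")


if __name__ == "__main__":
    state = tdp_mmu_enabled()
    print({True: "enabled", False: "disabled", None: "unavailable"}[state])
```

On a kernel carrying the fix this would be expected to report "disabled" unless the default was overridden, for example with kvm.tdp_mmu=Y on the kernel command line or an "options kvm tdp_mmu=Y" line in a modprobe.d file.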

  Regression potential:
  VM hosts with many large-memory tenants might see a performance impact,
  which the TDP MMU approach tried to solve. Hosts that did not see other
  problems might turn the TDP MMU on again.

  Testcase:
  Large OpenStack instance (64GB memory, AMD CPU (using SVM)) with a large
  second-level guest (32GB memory). Repeatedly start and stop the
  second-level guest.
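
The start/stop loop could be scripted roughly as follows. This is only a sketch under assumptions: it presumes the second-level guest is a libvirt domain controlled with virsh, which the report does not state, and the domain name, iteration count and pause are placeholders.

```python
import subprocess
import time


def cycle_guest(domain, iterations, pause=30.0,
                run=subprocess.run, sleep=time.sleep):
    """Start and force-stop a libvirt domain repeatedly; return completed cycles."""
    completed = 0
    for _ in range(iterations):
        # Boot the nested guest.
        run(["virsh", "start", domain], check=True)
        sleep(pause)  # give the guest time to boot and touch its memory
        # "virsh destroy" powers the guest off immediately.
        run(["virsh", "destroy", domain], check=True)
        sleep(pause)
        completed += 1
    return completed


if __name__ == "__main__":
    # "nested-guest" is a hypothetical domain name.
    print(cycle_guest("nested-guest", iterations=50))
```

The run and sleep parameters exist only so the loop can be dry-run without libvirt. While the loop runs, the first-level instance's console log is the place to watch for soft lockup messages.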

  
  --- original description ---
  The crash occurred on a Juju machine, and the Juju agent was lost.
  The Juju machine is on an OpenStack instance provisioned by Juju.

  The OpenStack console log indicates that it is related to spin_lock and KVM MMU:
  [418200.348830]  ? _raw_spin_lock+0x22/0x30
  [418200.349588]  _raw_write_lock+0x20/0x30
  [418200.350196]  kvm_tdp_mmu_map+0x2b1/0x490 [kvm]
  [418200.351014]  kvm_mmu_notifier_invalidate_range_start+0x1ad/0x300 [kvm]
  [418200.351796]  direct_page_fault+0x206/0x310 [kvm]
  [418200.352667]  __mmu_notifier_invalidate_range_start+0x91/0x1b0
  [418200.353624]  kvm_tdp_page_fault+0x72/0x90 [kvm]
  [418200.354496]  try_to_migrate_one+0x691/0x730
  [418200.355436]  kvm_mmu_page_fault+0x73/0x1c0 [kvm]

  openstack console log: https://pastebin.canonical.com/p/spmH8r3crQ/

  syslog: https://pastebin.canonical.com/p/wFPsFD8G9n/
  The syslog was rotated after the crash occurred, so the syslog at the
  time of the initial crash was lost.

  Other Juju machines with the 5.15.0.79.76 kernel seem to have the same
  issue.

  We previously had a similar issue with 5.15.0-73. The Juju machine
  crashed with raw_spin_lock and KVM MMU in the logs as well:
  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2026229

  ProblemType: Bug
  DistroRelease: Ubuntu 22.04
  Package: linux-image-5.19.0-46-generic 5.19.0-46.47~22.04.1
  ProcVersionSignature: Ubuntu 5.19.0-46.47~22.04.1-generic 5.19.17
  Uname: Linux 5.19.0-46-generic x86_64
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  ApportVersion: 2.20.11-0ubuntu82.5
  Architecture: amd64
  CasperMD5CheckResult: unknown
  CloudArchitecture: x86_64
  CloudID: openstack
  CloudName: openstack
  CloudPlatform: openstack
  CloudSubPlatform: metadata (http://169.254.169.254)
  Date: Mon Aug 21 08:59:46 2023
  Ec2AMI: ami-00000c61
  Ec2AMIManifest: FIXME
  Ec2AvailabilityZone: availability-zone-1
  Ec2InstanceType: builder-cpu4-ram72-disk20
  Ec2Kernel: unavailable
  Ec2Ramdisk: unavailable
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   LANG=C.UTF-8
   SHELL=/bin/bash
  SourcePackage: linux-signed-hwe-5.19
  UpgradeStatus: No upgrade log present (probably fresh install)
  ---
  ProblemType: Bug
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 Aug 23 03:23 seq
   crw-rw---- 1 root audio 116, 33 Aug 23 03:23 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
  ApportVersion: 2.20.11-0ubuntu82.5
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CRDA: N/A
  CasperMD5CheckResult: unknown
  CloudArchitecture: x86_64
  CloudID: openstack
  CloudName: openstack
  CloudPlatform: openstack
  CloudSubPlatform: metadata (http://169.254.169.254)
  DistroRelease: Ubuntu 22.04
  Ec2AMI: ami-00000fbb
  Ec2AMIManifest: FIXME
  Ec2AvailabilityZone: availability-zone-2
  Ec2InstanceType: builder-cpu2-ram44-disk20
  Ec2Kernel: unavailable
  Ec2Ramdisk: unavailable
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
  Lsusb: Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
  Lsusb-t: /:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=uhci_hcd/2p, 12M
  MachineType: OpenStack Foundation OpenStack Nova
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  Package: linux (not installed)
  PciMultimedia:

  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   LANG=C.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 qxldrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.15.0-83-generic root=UUID=a6de04b8-3631-4ce4-bb96-48076f4a56bf ro console=tty1 console=ttyS0
  ProcVersionSignature: Ubuntu 5.15.0-83.92-generic 5.15.116
  RelatedPackageVersions:
   linux-restricted-modules-5.15.0-83-generic N/A
   linux-backports-modules-5.15.0-83-generic  N/A
   linux-firmware                             20220329.git681281e4-0ubuntu3.17
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
  Tags:  jammy ec2-images
  Uname: Linux 5.15.0-83-generic x86_64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups: N/A
  _MarkForUpload: True
  dmi.bios.date: 04/01/2014
  dmi.bios.release: 0.0
  dmi.bios.vendor: SeaBIOS
  dmi.bios.version: 1.13.0-1ubuntu1.1
  dmi.chassis.type: 1
  dmi.chassis.vendor: QEMU
  dmi.chassis.version: pc-i440fx-4.2
  dmi.modalias: dmi:bvnSeaBIOS:bvr1.13.0-1ubuntu1.1:bd04/01/2014:br0.0:svnOpenStackFoundation:pnOpenStackNova:pvr21.2.4:cvnQEMU:ct1:cvrpc-i440fx-4.2:sku:
  dmi.product.family: Virtual Machine
  dmi.product.name: OpenStack Nova
  dmi.product.version: 21.2.4
  dmi.sys.vendor: OpenStack Foundation

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2032176/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
