To add a bit more detail (maybe unrelated, but with so little evidence
everything helps): when those lockups happen, is the server at least
pingable? Another idea, as long as those servers are accessible enough,
would be to check whether sysrq combinations are still handled.
Though I fear that, at least for Stéphane, the server is somewhere else
with probably only ssh (maybe ipmi) access. But if that were possible
and working, one could prepare kdump and enable the sysrq crash
combo.
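
As a rough sketch of the "enable the sysrq crash combo" step: the
keyboard combos are gated by the kernel.sysrq bitmask, and as far as I
know the crash combo (SysRq-c) sits behind the "debugging dumps" bit
(0x8), with a value of 1 meaning everything is enabled. The helper names
below are made up for illustration, and the Ubuntu default of 176
mentioned in the comment is from memory, so treat both as assumptions.

```python
from pathlib import Path

# Bit that gates "debugging dumps" in kernel.sysrq; assumed here to be
# the bit that also gates the crash combo (SysRq-c).
SYSRQ_ENABLE_DUMP = 0x8


def crash_combo_allowed(value: int) -> bool:
    """Return True if this kernel.sysrq value permits the crash combo."""
    if value == 1:
        # 1 is the special "enable all SysRq functions" value.
        return True
    # Any other value is a bitmask; 0 disables everything.
    return bool(value & SYSRQ_ENABLE_DUMP)


def read_sysrq(path: str = "/proc/sys/kernel/sysrq") -> int:
    """Read the current kernel.sysrq setting (hypothetical helper)."""
    return int(Path(path).read_text().strip())


# Example use on the affected machine:
#   value = read_sysrq()
#   print(value, crash_combo_allowed(value))
# A value of 176 (assumed Ubuntu default) would report False.
```

If this reports False, something like kernel.sysrq=1 via sysctl would be
needed before the keyboard (or serial break) combo can trigger a kdump.
Writes to /proc/sysrq-trigger by root are not subject to the mask, but
that path is of little use once ssh is gone.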

Otherwise, and again this is probably only possible for Luis, if his
devel servers do not need zfs, it would help to see how various mainline
kernels between 4.4 and 4.15 behave. In parallel, have some "canary"
run the latest update. IIRC the one just released pulled in a large
portion of upstream stable.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1799497

Title:
  4.15 kernel hard lockup about once a week

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Bionic:
  Incomplete

Bug description:
  My main server has been running into hard lockups about once a week
  ever since I switched to the 4.15 Ubuntu 18.04 kernel.

  When this happens, nothing is printed to the console; it's effectively
  stuck showing a login prompt. The system is running with panic=1 on
  the cmdline but isn't rebooting, so the kernel isn't even processing
  this as a kernel panic.

  
  As this felt like a potential hardware issue, I had my hosting
  provider give me a completely different system: different
  motherboard, different CPU, different RAM and different storage. I
  installed 18.04 on that system and moved my data over; a week later,
  I hit the issue again.

  We've since also had an LXD user report similar symptoms here, also
  on varying hardware:
    https://github.com/lxc/lxd/issues/5197

  
  My system doesn't have a lot of memory pressure with about 50% of free memory:

  root@vorash:~# free -m
                total        used        free      shared  buff/cache   available
  Mem:          31819       17574         402         513       13842       13292
  Swap:         15909        2687       13222

  I will now try to increase console logging as much as possible on the
  system in the hopes that next time it hangs we can get a better idea
  of what happened but I'm not too hopeful given the complete silence on
  the console when this occurs.

  System is currently on:
    Linux vorash 4.15.0-36-generic #39-Ubuntu SMP Mon Sep 24 16:19:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

  But I've seen this since the GA kernel on 4.15, so it's not a recent
  regression.
  --- 
  ProblemType: Bug
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 Oct 23 16:12 seq
   crw-rw---- 1 root audio 116, 33 Oct 23 16:12 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
  ApportVersion: 2.20.9-0ubuntu7.4
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
  AudioDevicesInUse:
   Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1: Cannot stat file /proc/22822/fd/10: Permission denied
   Cannot stat file /proc/22831/fd/10: Permission denied
  DistroRelease: Ubuntu 18.04
  HibernationDevice:
   RESUME=none
   CRYPTSETUP=n
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
  Lsusb:
   Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
   Bus 001 Device 002: ID 046b:ff10 American Megatrends, Inc. Virtual Keyboard and Mouse
   Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
  MachineType: Intel Corporation S1200SP
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  Package: linux (not installed)
  PciMultimedia:
   
  ProcEnviron:
   TERM=xterm
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=<set>
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 mgadrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-38-generic root=UUID=575c878a-0be6-4806-9c83-28f67aedea65 ro biosdevname=0 net.ifnames=0 panic=1 verbose console=tty0 console=ttyS0,115200n8
  ProcVersionSignature: Ubuntu 4.15.0-38.41-generic 4.15.18
  RelatedPackageVersions:
   linux-restricted-modules-4.15.0-38-generic N/A
   linux-backports-modules-4.15.0-38-generic  N/A
   linux-firmware                             1.173.1
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
  Tags:  bionic
  Uname: Linux 4.15.0-38-generic x86_64
  UnreportableReason: This report is about a package that is not installed.
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups:
   
  _MarkForUpload: False
  dmi.bios.date: 01/25/2018
  dmi.bios.vendor: Intel Corporation
  dmi.bios.version: S1200SP.86B.03.01.1029.012520180838
  dmi.board.asset.tag: Base Board Asset Tag
  dmi.board.name: S1200SP
  dmi.board.vendor: Intel Corporation
  dmi.board.version: H57532-271
  dmi.chassis.asset.tag: ....................
  dmi.chassis.type: 23
  dmi.chassis.vendor: ...............................
  dmi.chassis.version: ..................
  dmi.modalias: dmi:bvnIntelCorporation:bvrS1200SP.86B.03.01.1029.012520180838:bd01/25/2018:svnIntelCorporation:pnS1200SP:pvr....................:rvnIntelCorporation:rnS1200SP:rvrH57532-271:cvn...............................:ct23:cvr..................:
  dmi.product.family: Family
  dmi.product.name: S1200SP
  dmi.product.version: ....................
  dmi.sys.vendor: Intel Corporation

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1799497/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
