That's an OOM problem. Just increase the reserved crashkernel memory
size and test again, please.
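
For anyone following along, here is a rough sketch of previewing a larger
reservation. It assumes the usual Ubuntu setup, where crashkernel= is set in a
GRUB configuration fragment (typically /etc/default/grub.d/kdump-tools.cfg)
and applied with update-grub plus a reboot. The helper name set_crashkernel
and the 16384M value are hypothetical and only for illustration. The helper
rewrites a command-line string and touches no files.

```shell
#!/bin/sh
# Hypothetical helper: replace (or append) the crashkernel= value in a
# kernel command-line string. It edits text only and touches no files.
set_crashkernel() {
    cmdline=$1
    size=$2
    case " $cmdline " in
        *" crashkernel="*)
            # Replace the existing value, leaving other parameters intact.
            printf '%s\n' "$cmdline" |
                sed -E "s/(^| )crashkernel=[^ ]*/\1crashkernel=$size/"
            ;;
        *)
            # No crashkernel= present yet: append one.
            printf '%s crashkernel=%s\n' "$cmdline" "$size"
            ;;
    esac
}

# Command line from this report, with an illustrative (assumed) new size.
set_crashkernel "root=UUID=45bb7eb2-4c61-425d-8bf9-4e6f16829ddb ro splash quiet crashkernel=8192M" 16384M
```

After a reboot, "dmesg | grep Reser" should show whether the new size was
actually reserved.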

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to makedumpfile in Ubuntu.
https://bugs.launchpad.net/bugs/1778844

Title:
  nvme multipath does not report path relationships

Status in The Ubuntu-power-systems project:
  Won't Fix
Status in initramfs-tools package in Ubuntu:
  In Progress
Status in linux package in Ubuntu:
  Invalid
Status in makedumpfile package in Ubuntu:
  Invalid
Status in initramfs-tools source package in Bionic:
  In Progress
Status in linux source package in Bionic:
  Invalid
Status in makedumpfile source package in Bionic:
  Invalid
Status in initramfs-tools source package in Cosmic:
  In Progress
Status in linux source package in Cosmic:
  Invalid
Status in makedumpfile source package in Cosmic:
  Invalid
Status in initramfs-tools source package in Disco:
  In Progress
Status in linux source package in Disco:
  Invalid
Status in makedumpfile source package in Disco:
  Invalid

Bug description:
  Problem Description:
  ===================
  After triggering a crash, kdump does not work and the system drops into the
  initramfs shell.

  Steps to re-create:
  ==================

  > woo is running the ubuntu180401 kernel

  root@woo:~# uname -a
  Linux woo 4.15.0-23-generic #25-Ubuntu SMP Wed May 23 17:59:00 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux
  root@woo:~#

  > Crashkernel value is as below

  root@woo:~# free -h
                total        used        free      shared  buff/cache   available
  Mem:           503G        2.0G        501G         13M        279M        499G
  Swap:          2.0G          0B        2.0G

  root@woo:~# cat /proc/cmdline
  root=UUID=45bb7eb2-4c61-425d-8bf9-4e6f16829ddb ro splash quiet crashkernel=8192M

  >  kdump status

  root@woo:~#  kdump-config status
  current state   : ready to kdump

  root@woo:~#  kdump-config show
  DUMP_MODE:        kdump
  USE_KDUMP:        1
  KDUMP_SYSCTL:     kernel.panic_on_oops=1
  KDUMP_COREDIR:    /var/crash
  crashkernel addr:
     /var/lib/kdump/vmlinuz: symbolic link to /boot/vmlinux-4.15.0-23-generic
  kdump initrd:
     /var/lib/kdump/initrd.img: symbolic link to /var/lib/kdump/initrd.img-4.15.0-23-generic
  current state:    ready to kdump

  kexec command:
    /sbin/kexec -p --command-line="root=UUID=45bb7eb2-4c61-425d-8bf9-4e6f16829ddb ro splash quiet nr_cpus=1 systemd.unit=kdump-tools.service irqpoll noirqdistrib nousb" --initrd=/var/lib/kdump/initrd.img /var/lib/kdump/vmlinuz

  root@woo:~# dmesg | grep Reser
  [    0.000000] Reserving 8192MB of memory at 128MB for crashkernel (System RAM: 524288MB)
  [    0.000000] cma: Reserved 26224 MiB at 0x0000203995000000
  [    3.545490] Copyright (C) 2017-2018 Broadcom. All Rights Reserved. The term "Broadcom" refers to Broadcom Limited and/or its subsidiaries.

  > Triggered crash

  root@woo:~# echo 1 > /proc/sys/kernel/sysrq
  root@woo:~# echo c > /proc/sysrq-trigger
  [   73.056308] sysrq: SysRq : Trigger a crash
  [   73.056357] Unable to handle kernel paging request for data at address 0x00000000
  [   73.056459] Faulting instruction address: 0xc0000000007f24c8
  [   73.056543] Oops: Kernel access of bad area, sig: 11 [#1]
  [   73.056609] LE SMP NR_CPUS=2048 NUMA PowerNV
  [   73.056668] Modules linked in: rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_uverbs(OE) ib_umad(OE) esp6_offload esp6 esp4_offload esp4 xfrm_algo mlx5_fpga_tools(OE) mlx4_en(OE) mlx4_ib(OE) mlx4_core(OE) rpcsec_gss_krb5 nfsv4 nfs fscache binfmt_misc idt_89hpesx vmx_crypto crct10dif_vpmsum ofpart cmdlinepart ipmi_powernv ipmi_devintf at24 powernv_flash ipmi_msghandler ibmpowernv mtd opal_prd uio_pdrv_genirq uio nfsd auth_rpcgss nfs_acl lockd sch_fq_codel grace sunrpc knem(OE) ip_tables x_tables autofs4 btrfs xor zstd_compress raid6_pq mlx5_ib(OE) ib_core(OE) nouveau lpfc ast i2c_algo_bit ttm mlx5_core(OE) drm_kms_helper mlxfw(OE) nvmet_fc devlink syscopyarea nvmet mlx_compat(OE) sysfillrect cxl nvme_fc sysimgblt fb_sys_fops nvme_fabrics nvme ahci crc32c_vpmsum drm scsi_transport_fc
  [   73.057601]  tg3 libahci nvme_core pnv_php
  [   73.057652] CPU: 44 PID: 4626 Comm: bash Tainted: G           OE    4.15.0-23-generic #25-Ubuntu
  [   73.057767] NIP:  c0000000007f24c8 LR: c0000000007f3568 CTR: c0000000007f24a0
  [   73.057868] REGS: c000003f8269f9f0 TRAP: 0300   Tainted: G           OE     (4.15.0-23-generic)
  [   73.057986] MSR:  9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 28222222  XER: 20040000
  [   73.058099] CFAR: c0000000007f3564 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 1
  [   73.058099] GPR00: c0000000007f3568 c000003f8269fc70 c0000000016eaf00 0000000000000063
  [   73.058099] GPR04: c000003fef47ce18 c000003fef494368 9000000000009033 0000000031da0058
  [   73.058099] GPR08: 0000000000000007 0000000000000001 0000000000000000 9000000000001003
  [   73.058099] GPR12: c0000000007f24a0 c000000007a2e400 00000e4fa497c900 0000000000000000
  [   73.058099] GPR16: 00000e4f79cc94b0 00000e4f79d567e0 00000e4f79d88204 00000e4f79d56818
  [   73.058099] GPR20: 00000e4f79d8d5d8 0000000000000001 0000000000000000 00007ffffefce644
  [   73.058099] GPR24: 00007ffffefce640 00000e4f79d8afb4 c0000000015e9aa8 0000000000000002
  [   73.058099] GPR28: 0000000000000063 0000000000000004 c000000001572b1c c0000000015e9e68
  [   73.059060] NIP [c0000000007f24c8] sysrq_handle_crash+0x28/0x30
  [   73.059142] LR [c0000000007f3568] __handle_sysrq+0xf8/0x2c0
  [   73.059215] Call Trace:
  [   73.059254] [c000003f8269fc70] [c0000000007f3548] __handle_sysrq+0xd8/0x2c0 (unreliable)
  [   73.059358] [c000003f8269fd10] [c0000000007f3d74] write_sysrq_trigger+0x64/0x90
  [   73.059456] [c000003f8269fd40] [c000000000481248] proc_reg_write+0x88/0xd0
  [   73.059543] [c000003f8269fd70] [c0000000003d43fc] __vfs_write+0x3c/0x70
  [   73.059627] [c000003f8269fd90] [c0000000003d4658] vfs_write+0xd8/0x220
  [   73.059716] [c000003f8269fde0] [c0000000003d4978] SyS_write+0x68/0x110
  [   73.059809] [c000003f8269fe30] [c00000000000b284] system_call+0x58/0x6c
  [   73.059896] Instruction dump:
  [   73.059940] 4bfff9f1 4bfffe50 3c4c00f0 38428a60 7c0802a6 60000000 39200001 3d42001c
  [   73.060040] 394a6db0 912a0000 7c0004ac 39400000 <992a0000> 4e800020 3c4c00f0 38428a30
  [   73.060159] ---[ end trace e116d2421d2f59a5 ]---
  [   74.067059]
  [   74.067172] Sending IPI to other CPUs
  [   75.851509] IPI complete
  [  202.275317797,5] OPAL: Switch to big-endian OS
  [  207.151277658,5] OPAL: Switch to little-endian OS
  [  232.159296542,3] PHB#0033[8:3]: CRESET: Unexpected slot state 00000102, resetting...
  [   78.164472] kexec: Starting switchover sequence.
  [    1.412463] integrity: Unable to open file: /etc/keys/x509_ima.der (-2)
  [    1.412468] integrity: Unable to open file: /etc/keys/x509_evm.der (-2)
  [    1.481335] vio vio: uevent: failed to send synthetic uevent
  [    2.534732] nouveau 0004:04:00.0: unknown chipset (140000a1)
  [    2.534847] nouveau 0004:05:00.0: unknown chipset (140000a1)
  [    2.534967] nouveau 0035:03:00.0: unknown chipset (140000a1)
  [    2.535144] nouveau 0035:04:00.0: unknown chipset (140000a1)

  
  Gave up waiting for root file system device.  Common problems:
   - Boot args (cat /proc/cmdline)
     - Check rootdelay= (did the system wait long enough?)
   - Missing modules (cat /proc/modules; ls /dev)
  ALERT!  UUID=45bb7eb2-4c61-425d-8bf9-4e6f16829ddb does not exist.  Dropping to a shell!

  
  BusyBox v1.27.2 (Ubuntu 1:1.27.2-2ubuntu3) built-in shell (ash)
  Enter 'help' for a list of built-in commands.

  (initramfs)
  (initramfs)

  
  == Comment: #1 - INDIRA P. JOGA <> - 2018-06-21 01:19:41 ==
  Attached woo console logs for kdump issue

  == Comment: #4 - INDIRA P. JOGA <> - 2018-06-26 01:51:47 ==
  I triggered a crash again and the system stops here:

  [ 1032.259696471,3] PHB#0033[8:3]: CRESET: Unexpected slot state 00000102, resetting...
  omplete
  [  823.882048] kexec: Starting switchover sequence.
  [    1.154056] integrity: Unable to open file: /etc/keys/x509_ima.der (-2)
  [    1.154060] integrity: Unable to open file: /etc/keys/x509_evm.der (-2)
  [    1.222719] vio vio: uevent: failed to send synthetic uevent
  [    2.212065] nouveau 0004:04:00.0: unknown chipset (140000a1)
  [    2.214995] nouveau 0004:05:00.0: unknown chipset (140000a1)
  [    2.215259] nouveau 0035:03:00.0: unknown chipset (140000a1)
  [    2.215408] nouveau 0035:04:00.0: unknown chipset (140000a1)
  Gave up waiting for root file system device.  Common problems:
   - Boot args (cat /proc/cmdline)
     - Check rootdelay= (did the system wait long enough?)
   - Missing modules (cat /proc/modules; ls /dev)
  ALERT!  UUID=45bb7eb2-4c61-425d-8bf9-4e6f16829ddb does not exist.  Dropping to a shell!

  
  BusyBox v1.27.2 (Ubuntu 1:1.27.2-2ubuntu3) built-in shell (ash)
  Enter 'help' for a list of built-in commands.

  (initramfs)


  == Comment: #5 - INDIRA P. JOGA <> - 2018-06-26 02:40:56 ==
  NOTE:

  An NVMe disk is used as the root disk here.

  
  == Comment: #8 - Hari Krishna Bathini <> - 2018-06-26 06:06:13 ==
  The dump target (/var/crash) is on an NVMe device, which is also the root
  disk, but the kdump initrd is not being built with the nvme driver modules.
  As a result, the NVMe disk is not found and the kdump kernel drops into the
  initramfs shell. Using the default initrd, which includes the nvme driver
  modules, the dump was captured successfully.

  Can someone from Canonical take a look at this and comment on why the
  nvme modules are not included in the kdump initrd despite it being the
  root disk?

  
  Thanks
  Hari
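
  One way to check the diagnosis above is to list the contents of both initrds
  and count the NVMe modules in each. This is a minimal sketch, assuming
  lsinitramfs from initramfs-tools is installed; the helper name
  count_nvme_modules is hypothetical, and the /boot/initrd.img-... path for the
  default initrd is an assumption based on the usual Ubuntu layout.

```shell
#!/bin/sh
# Hypothetical helper: read an initrd file listing on stdin (one path per
# line, as printed by lsinitramfs) and print how many entries are kernel
# modules whose file name starts with "nvme" (this also counts nvmet*).
count_nvme_modules() {
    # grep -c exits non-zero when the count is 0, so mask the exit status.
    grep -c -E '/nvme[^/]*\.ko(\.[a-z0-9]+)?$' || true
}

# Usage on the system from this report (second path is an assumption):
#   lsinitramfs /var/lib/kdump/initrd.img | count_nvme_modules
#   lsinitramfs /boot/initrd.img-4.15.0-23-generic | count_nvme_modules
```

  A count of 0 for the kdump initrd and a non-zero count for the default
  initrd would be consistent with the behaviour described in comment #8.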

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1778844/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
