------- Comment From hbath...@in.ibm.com 2016-10-04 05:08 EDT-------
(In reply to comment #27)
> Hi
>
> Trying to Verify this bug on Ubuntu16.10 ( 4.8.0-17) now dump captured
> but makedumpfile failed so dump was in incomplete.
>
> LOG:
>
> [ 92.085437] kdump-tools[10063]:
> Starting kdump-tools: * running makedumpfile -c -d 31 /proc/vmcore /
> p-incomplete
>
> [ 92.095715] kdump-tools[10063]: get_mem_map: Can't distinguish the memory
> type.
>
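For readers unfamiliar with the `-d 31` argument in the quoted makedumpfile command: the dump level is a bitmask selecting which page types are left out of the dump. The following is a minimal sketch, assuming the bit assignments listed in the makedumpfile(8) man page; the function name `decode_dump_level` and the descriptive strings are hypothetical and only illustrate the arithmetic, they are not part of makedumpfile itself.

```python
from typing import List

# Bit values as listed for the -d option in the makedumpfile(8) man page.
# The descriptions are paraphrased, not the tool's own wording.
DUMP_LEVEL_BITS = {
    1: "zero pages",
    2: "cache pages without private data",
    4: "cache pages with private data",
    8: "user process data pages",
    16: "free pages",
}


def decode_dump_level(level: int) -> List[str]:
    """Return the page types excluded by a given dump level (0..31)."""
    if not 0 <= level <= 31:
        raise ValueError("dump level must be between 0 and 31")
    # A page type is excluded when its bit is set in the level.
    return [name for bit, name in DUMP_LEVEL_BITS.items() if level & bit]


if __name__ == "__main__":
    for name in decode_dump_level(31):
        print("excluded:", name)
```

Under these assumptions, level 31 sets all five bits, so the command in the log asks for the smallest possible dump.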
Bug 146571 / LP Bug 1626269 is being used to track this.

Thanks
Hari

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1627036

Title:
In Ubuntu16.10:Fadump fails as Kernel panic reported while dumping-,console got hung on 32TB Brazos System (kdump)

Status in linux package in Ubuntu:
Triaged

Bug description:
== Comment: #0 - Praveen K. Pandey <praveen.pan...@in.ibm.com> - 2016-07-17 02:37:31 ==
Hi

In Ubuntu16.10 I tried fadump on a Brazos system (32TB memory and 192 cores). When I triggered a panic, the kernel panicked and the console hung.

Reproducible Step:

1- Install Ubuntu16.10
2- boot system with 31TB and 192 Core
3- configure fadump in system
4- verify fadump in system that it is running
5- Trigger panic in system

Actual Result

Not able to take Fadump; kernel panic and console got hung

Expected Result

Fadump will be captured

Log:

root@ltc-brazos1:~# kdump-config show
DUMP_MODE: fadump
USE_KDUMP: 1
KDUMP_SYSCTL: kernel.panic_on_oops=1
KDUMP_COREDIR: /var/crash
/var/lib/kdump/vmlinuz: symbolic link to /boot/vmlinux-4.4.0-30-generic
kdump initrd:
/var/lib/kdump/initrd.img: symbolic link to /var/lib/kdump/initrd.img-4.4.0-30-generic
current state: ready to fadump
root@ltc-brazos1:~#
root@ltc-brazos1:~# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinux-4.4.0-30-generic root=UUID=516c4b1b-6700-4b55-bd37-d61c4c5af6af ro quiet splash fadump=on fadump_reserve_mem=4096M crashkernel=4096M
root@ltc-brazos1:~#

ltc-brazos1 login:
[ 442.749993] sysrq: SysRq : Trigger a crash
[ 442.750031] Unable to handle kernel paging request for data at address 0x00000000
[ 442.750037] Faulting instruction address: 0xc000000000670014
[ 442.750043] Oops: Kernel access of bad area, sig: 11 [#1]
[ 442.750047] SMP NR_CPUS=2048 NUMA pSeries
[ 442.750053] Modules linked in: pseries_rng btrfs xor raid6_pq rtc_generic sunrpc autofs4 ses enclosure ipr
[ 442.750068] CPU: 157 PID: 403890 Comm: bash Not tainted 4.4.0-30-generic #49-Ubuntu
[ 442.750074] task: c00003f97b0af640 ti: c00003f97b104000 task.ti: c00003f97b104000
[ 442.750079] NIP: c000000000670014 LR: c0000000006710c8 CTR: c00000000066ffe0
[ 442.750083] REGS: c00003f97b107990 TRAP: 0300 Not tainted (4.4.0-30-generic)
[ 442.750088] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28242222 XER: 00000001
[ 442.750100] CFAR: c000000000008468 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 1
GPR00: c0000000006710c8 c00003f97b107c10 c0000000015b5d00 0000000000000063
GPR04: c00001faba749c50 c00001faba75b4e0 c0001f3efe7c0000 0000000000000313
GPR08: 0000000000000007 0000000000000001 0000000000000000 c0001f3efe7cecb8
GPR12: c00000000066ffe0 c00000000bc9d380 ffffffffffffffff 0000000022000000
GPR16: 0000000010170dc8 000001001ef401d8 0000000010140f58 00000000100c7570
GPR20: 0000000000000000 000000001017dd58 0000000010153618 000000001017b608
GPR24: 00003ffff7c9e7b4 0000000000000001 c0000000014f8e58 0000000000000004
GPR28: c0000000014f9218 0000000000000063 c0000000014b11dc 0000000000000000
[ 442.750165] NIP [c000000000670014] sysrq_handle_crash+0x34/0x50
[ 442.750170] LR [c0000000006710c8] __handle_sysrq+0xe8/0x270
[ 442.750174] Call Trace:
[ 442.750179] [c00003f97b107c10] [c000000000e08f28] _fw_tigon_tg3_bin_name+0x2ce58/0x342b0 (unreliable)
[ 442.750186] [c00003f97b107c30] [c0000000006710c8] __handle_sysrq+0xe8/0x270
[ 442.750192] [c00003f97b107cd0] [c000000000671868] write_sysrq_trigger+0x78/0xa0
[ 442.750199] [c00003f97b107d00] [c00000000037ae30] proc_reg_write+0xb0/0x110
[ 442.750205] [c00003f97b107d50] [c0000000002e186c] __vfs_write+0x6c/0xe0
[ 442.750210] [c00003f97b107d90] [c0000000002e25a0] vfs_write+0xc0/0x230
[ 442.750216] [c00003f97b107de0] [c0000000002e35dc] SyS_write+0x6c/0x110
[ 442.750222] [c00003f97b107e30] [c000000000009204] system_call+0x38/0xb4
[ 442.750226] Instruction dump:
[ 442.750229] 38425d20 7c0802a6 f8010010 f821ffe1 60000000 60000000 3d220019 394931e4
[ 442.750238] 39200001 912a0000 7c0004ac 39400000 <992a0000> 38210020 e8010010 7c0803a6
[ 442.750248] ---[ end trace ff61e1bc4dd59a42 ]---
[ 442.752585]
Loading Linux 4.4.0-30-generic ...
Loading initial ramdisk ...
OF stdout device is: /vdevice/vty@30000000
Preparing to boot Linux version 4.4.0-30-generic (buildd@bos01-ppc64el-023) (gcc version 5.3.1 20160413 (Ubuntu/IBM 5.3.1-14ubuntu2.1) ) #49-Ubuntu SMP Fri Jul 1 10:00:36 UTC 2016 (Ubuntu 4.4.0-30.49-generic 4.4.13)
Detected machine type: 0000000000000101
Max number of cores passed to firmware: 256 (NR_CPUS = 2048)
Calling ibm,client-architecture-support... done
command line: BOOT_IMAGE=/boot/vmlinux-4.4.0-30-generic root=UUID=516c4b1b-6700-4b55-bd37-d61c4c5af6af ro quiet splash fadump=on fadump_reserve_mem=4096M crashkernel=4096M
Ignoring mem=0000000100000000 >= ram_top.
memory layout at init:
memory_limit : 0000000000000000 (16 MB aligned)
alloc_bottom : 000000000e020000
alloc_top : 0000000010000000
alloc_top_hi : 0000000010000000
rmo_top : 0000000010000000
ram_top : 0000000010000000
instantiating rtas at 0x000000000e9e0000... done
prom_hold_cpus: skipped
copying OF device tree...
Building dt strings...
Building dt structure...
Device tree strings 0x000000000e030000 -> 0x000000000e0319a4
Device tree struct 0x000000000e040000 -> 0x000000000e640000
Quiescing Open Firmware ...
Booting Linux via __start() ...
-> smp_release_cpus()
spinning_secondaries = 1535
<- smp_release_cpus()
<- setup_system()
[ 0.000000] Kernel panic - not syncing: memblock_virt_alloc_try_nid: Failed to allocate 16777216 bytes align=0x1000000 nid=1 from=0xfffffffffffffff max_addr=0x0
[ 0.000000]
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.4.0-30-generic #49-Ubuntu
[ 0.000000] Call Trace:
[ 0.000000] [c0000000015b39d0] [c000000000af955c] dump_stack+0xb0/0xf0 (unreliable)
[ 0.000000] [c0000000015b3a10] [c000000000af5790] panic+0x100/0x2c0
[ 0.000000] [c0000000015b3aa0] [c000000000ed238c] memblock_virt_alloc_try_nid+0xc0/0xe8
[ 0.000000] [c0000000015b3b30] [c0000000002db69c] __earlyonly_bootmem_alloc.constprop.2+0x50/0x74
[ 0.000000] [c0000000015b3b70] [c000000000afc5fc] vmemmap_populate+0xf8/0x250
[ 0.000000] [c0000000015b3c40] [c000000000afdfa8] sparse_mem_map_populate+0x38/0x64
[ 0.000000] [c0000000015b3c70] [c000000000ed4234] sparse_init+0x1d4/0x298
[ 0.000000] [c0000000015b3d30] [c000000000eb3604] initmem_init+0xabc/0xd68
[ 0.000000] [c0000000015b3e50] [c000000000eab418] setup_arch+0x270/0x300
[ 0.000000] [c0000000015b3f00] [c000000000ea3ae4] start_kernel+0xc4/0x558
[ 0.000000] [c0000000015b3f90] [c000000000008c6c] start_here_common+0x20/0xa8
[ 0.000000] ---[ end Kernel panic - not syncing: memblock_virt_alloc_try_nid: Failed to allocate 16777216 bytes align=0x1000000 nid=1 from=0xfffffffffffffff max_addr=0x0
[ 0.000000]

Regards
Praveen

== Comment: #1 - Praveen K. Pandey <praveen.pan...@in.ibm.com> - 2016-07-17 02:40:23 ==

== Comment: #14 - SRIKAR DRONAMRAJU <srikar.dronamr...@in.ibm.com> - 2016-08-31 11:02:28 ==
V3 was posted upstream at
http://lkml.kernel.org/r/1472476010-4709-1-git-send-email-sri...@linux.vnet.ibm.com.

That should at least solve the problem (at least it wouldn't panic/hang on triggering fadump).

The patches posted were on top of 4.8-rc3 and apply cleanly on v4.4.
I am not sure what is the kernel targeted for 16.10.
I hear it's going to be based on v4.8.

Once we know which kernel version Ubuntu is targeting we can backport the patchset accordingly.

== Comment: #18 - Gary M. Gaydos <gmgay...@us.ibm.com> - 2016-09-14 16:56:11 ==
Hi Canonical: Per this comment with patch set link, this bug appears to be fixed using the 4.4.0-34 kernel. Of course the 16.10 release will use a newer kernel.

V3 was posted upstream at
http://lkml.kernel.org/r/1472476010-4709-1-git-send-email-sri...@linux.vnet.ibm.com.

That should at least solve the problem (at least it wouldn't panic/hang on triggering fadump).

The patches posted were on top of 4.8-rc3 and apply cleanly on v4.4.
I am not sure what is the kernel targeted for 16.10. I hear it's going to be based on v4.8.

Once we know which kernel version Ubuntu is targeting we can backport the patchset accordingly.

Exposing a comment from test that was previously private:

(In reply to comment #16)
> Hi Praveen,
>
> I have applied the patches to the Yakkety kernel source and built the *.deb
> files. I have kept them on powerdev.in.ibm.com. Have sent you the access
> details over email

Hi Latha,

Thanks. I tried with the patched kernel and it seems the issue is fixed. I was able to capture the FAdump.

Log:

root@ltc-brazos1:~# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinux-4.4.0-34-generic root=UUID=bfdd4041-1b2f-42b1-b202-2c09f781bbcc ro fadump=on quiet splash fadump=on crashkernel=384M-:128M
root@ltc-brazos1:~#

root@ltc-brazos1:/var/crash# ls
201609140950 kexec_cmd linux-image-4.4.0-34-generic-201609140950.crash
root@ltc-brazos1:/var/crash# cd 201609140950
root@ltc-brazos1:/var/crash/201609140950# ls
dmesg.201609140950 dump.201609140950
root@ltc-brazos1:/var/crash/201609140950#

Regards
Praveen

== Comment: #20 - Hari Krishna Bathini <hbath...@in.ibm.com> - 2016-09-23 03:49:36 ==
Mirror the bug so Canonical can pick the fix patches.
Srikar, can you please provide the upstream commit ids of the fix patches?

Thanks
Hari

== Comment: #21 - Hari Krishna Bathini <hbath...@in.ibm.com> - 2016-09-23 03:59:17 ==
(In reply to comment #14)
> V3 was posted upstream at
> http://lkml.kernel.org/r/1472476010-4709-1-git-send-email-sri...@linux.vnet.
> ibm.com.
>
> That should atleast solve the problem (atleast it wouldnt panic/hang on
> triggering fadump)
>
> The patches posted were on top of 4.8-rc3 and apply cleanly on v4.4
> I am not sure what is the kernel targeted for 16.10. I hear its going to be
> based on v4.8

Yeah. 16.10 -proposed now has a v4.8 based kernel.

Thanks
Hari

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1627036/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help : https://help.launchpad.net/ListHelp