** Description changed:

+ [SRU justification]
+ This fix is required to make the crash tool usable. It also improves makedumpfile's filtering of pages.
+
+ [Impact]
+ Kernel crashes cannot be analysed with the crash tool.
+
+ [Fix]
+ Cherry-pick the upstream commits that fix these issues.
+
+ [Test Case]
+ Without the fix, running the crash tool on a kernel crash file displays something like:
+
+ # crash -s usr/lib/debug/boot/vmlinux-4.8.0-34-generic
+ crash: read error: kernel virtual address: ffffffff81e29ff0 type: "pv_init_ops"
+ crash: this kernel may be configured with CONFIG_STRICT_DEVMEM, which
+ renders /dev/mem unusable as a live memory source.
+ crash: trying /proc/kcore as an alternative to /dev/mem
+
+ crash: seek error: kernel virtual address: ffffffff81e29ff0 type: "pv_init_ops"
+ crash: seek error: kernel virtual address: ffffffff82166130 type: "shadow_timekeeper xtime_sec"
+ crash: seek error: kernel virtual address: ffffffff81e0d304 type: "init_uts_ns"
+ crash: usr/lib/debug/boot/vmlinux-4.8.0-34-generic and /var/crash/201701191308/dump.201701191308 do not match!
+
+ With the fix, the crash command works as expected.
+
+ [Regression]
+ None expected, as these modifications are part of Zesty and of the upstream versions.
+
+ [Original description of the problem]
  vmcore captured by kdump cannot be opened with crash:

  % sudo crash -d1 /usr/lib/debug/boot/vmlinux-4.8.0-34-generic /var/crash/201612282137/dump.201612282137
  ...
  ...
  base kernel version: 0.8.0
  linux_banner:
  ????????
  crash: /usr/lib/debug/boot/vmlinux-4 and /var/crash/201612282137/dump.201612282137 do not match!

  Usage:
-   crash [OPTION]... NAMELIST MEMORY-IMAGE[@ADDRESS] (dumpfile form)
-   crash [OPTION]... [NAMELIST] (live system form)
+   crash [OPTION]... NAMELIST MEMORY-IMAGE[@ADDRESS] (dumpfile form)
+   crash [OPTION]... [NAMELIST] (live system form)
  Enter "crash -h" for details.
- Looks like the 'linux_banner' cannot be understood by crash.
  And when the vmcore was being dumped, this message was shown:
  [ 729.609196] kdump-tools[5192]: The kernel version is not supported.
  [ 729.609447] kdump-tools[5192]: The makedumpfile operation may be incomplete.

  ---uname output---
  Linux roselp4 4.8.0-34-generic #36~16.04.1-Ubuntu SMP Wed Dec 21 18:53:20 UTC 2016 ppc64le ppc64le ppc64le GNU/Linux
-
- Machine Type = lpar
-
+
+ Machine Type = lpar
+
  ---Debugger---
  A debugger is not configured
-
+
  ---Steps to Reproduce---
- 1. config kdump
+ 1. config kdump
  2. trigger kdump
  3. analyse vmcore with crash
-
- Userspace tool common name: crash/makedumpfile
-
- The userspace tool has the following bit modes: 64-bit
+
+ Userspace tool common name: crash/makedumpfile
+
+ The userspace tool has the following bit modes: 64-bit

  Userspace rpm: makedumpfile 1.5.9-5ubuntu0.3/crash 7.1.4-1ubuntu4
- Userspace tool obtained from project website: na
-
- *Additional Instructions for Ping Tian Han/pt...@cn.ibm.com:
+ Userspace tool obtained from project website: na
+
+ *Additional Instructions for Ping Tian Han/pt...@cn.ibm.com:
  -Post a private note with access information to the machine that the bug is occurring on.
  -Attach ltrace and strace of userspace application.
  xtime timespec.tv_sec: 586481e8: Wed Dec 28 21:24:24 2016
  utsname:
-      sysname: Linux
-     nodename: boblp1
-      release: 4.8.0-32-generic
-      version: #34~16.04.1-Ubuntu SMP Tue Dec 13 17:01:57 UTC 2016
-      machine: ppc64le
-   domainname: (none)
+      sysname: Linux
+     nodename: boblp1
+      release: 4.8.0-32-generic
+      version: #34~16.04.1-Ubuntu SMP Tue Dec 13 17:01:57 UTC 2016
+      machine: ppc64le
+   domainname: (none)
  base kernel version: 4.8.0
  verify_namelist:
  dumpfile /proc/version:
  Linux version 4.8.0-32-generic (buildd@bos01-ppc64el-001) (gcc version 5.4.0 20160609 (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) ) #34~16.04.1-Ubuntu SMP Tue Dec 13 17:01:57 UTC 2016 (Ubuntu 4.8.0-32.34~16.04.1-generic 4.8.11)
  /usr/lib/debug/boot/vmlinux-4.8.0-32-generic:
  Linux version 4.8.0-32-generic (buildd@bos01-ppc64el-001) (gcc version 5.4.0 20160609 (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) ) #34~16.04.1-Ubuntu SMP Tue Dec 13 17:01:57 UTC 2016 (Ubuntu 4.8.0-32.34~16.04.1-generic 4.8.11)
  hypervisor: (undetermined)
  crash: per_cpu_symbol_search(per_cpu__tvec_bases): NULL
  ppc64_vmemmap_init: vmemmap base: f000000000000000
  crash: PPC64: cannot find 'cpu_possible_map', 'cpu_present_map', 'cpu_online_map' or 'cpu_active_map' symbols

  root@boblp1:/usr/lib/debug/boot# uname -a
  Linux boblp1 4.8.0-32-generic #34~16.04.1-Ubuntu SMP Tue Dec 13 17:01:57 UTC 2016 ppc64le ppc64le ppc64le GNU/Linux
  root@boblp1:/usr/lib/debug/boot#
- 1. Missing v4.8 support related patches in crash tool
- commit 098cdab16dfa6a85e9dad2cad604dee14ee15f66
- Author: Dave Anderson <ander...@redhat.com>
- Date: Fri Feb 12 14:32:53 2016 -0500
-
-     Fix for the changes made to the kernel module structure introduced by
-     this kernel commit for Linux 4.5 and later kernels:
-
-         commit 8244062ef1e54502ef55f54cced659913f244c3e
-         modules: fix longstanding /proc/kallsyms vs module insertion race.
-
-     Without the patch, the crash session fails during initialization
-     with the error message: "crash: invalid structure member offset:
-     module_num_symtab".
-     (ander...@redhat.com)
-
- commit 6f1f78e33474d00d5f261d7ed9d835c558b34d61
- Author: Dave Anderson <ander...@redhat.com>
- Date: Wed Jan 20 09:56:36 2016 -0500
-
-     Fix for the changes made to the kernel module structure introduced by
-     this kernel commit for Linux 4.5 and later kernels:
-
-         commit 7523e4dc5057e157212b4741abd6256e03404cf1
-         module: use a structure to encapsulate layout.
-
-     Without the patch, the crash session fails during initialization
-     with the error message: "crash: invalid structure member offset:
-     module_init_text_size".
-     (seb...@linux.vnet.ibm.com)
-
- commit 1e92f9fad3a7e3042b16996306cb2335760ef8c8
- Author: Dave Anderson <ander...@redhat.com>
- Date: Mon Feb 1 16:10:49 2016 -0500
-
-     Fix for the replacements made to the kernel's cpu_possible_mask,
-     cpu_online_mask, cpu_present_mask and cpu_active_mask symbols in
-     this kernel commit for Linux 4.5 and later kernels:
-
-         commit 5aec01b834fd6f8ca49d1aeede665b950d0c148e
-         kernel/cpu.c: eliminate cpu_*_mask
-
-     Without the patch, behavior is architecture-specific, dependent upon
-     whether the cpu mask values are used to calculate the number of cpus.
-     For example, ARM64 crash sessions fail during session initialization
-     with the error message "crash: zero-size memory allocation! (called
-     from <address>)", whereas X86_64 sessions come up normally, but
-     cpu mask values of zero are stored internally.
-     (ander...@redhat.com)
-
- commit 182914debbb9a2671ef644027fedd339aa9c80e0
- Author: Dave Anderson <ander...@redhat.com>
- Date: Fri Sep 23 09:09:15 2016 -0400
-
-     With the introduction of radix MMU in Power ISA 3.0, there are
-     changes in kernel page table management accommodating it. This patch
-     series makes appropriate changes here to work for such kernels.
-     Also, this series fixes a few bugs along the way:
-
-     ppc64: fix vtop page translation for 4K pages
-     ppc64: Use kernel terminology for each level in 4-level page table
-     ppc64/book3s: address changes in kernel v4.5
-     ppc64/book3s: address change in page flags for PowerISA v3.0
-     ppc64: use physical addresses and unfold pud for 64K page size
-     ppc64/book3s: support big endian Linux page tables
-
-     The patches are needed for Linux v4.5 and later kernels on all
-     ppc64 hardware.
- commit 8ceb1ac628bf6a0a7f0bbfff030ec93081bca4cd
- Author: Dave Anderson <ander...@redhat.com>
- Date: Mon May 23 11:23:01 2016 -0400
-
-     Fix for Linux commit 0139aa7b7fa12ceef095d99dc36606a5b10ab83a, which
-     renamed the page._count member to page._refcount. Without the patch,
-     certain "kmem" commands fail with the "kmem: invalid structure member
-     offset: page_count".
-     (ander...@redhat.com)
-
- commit 7136bf8495948cb059e5595b8503f8ae37019fa1
- Author: Dave Anderson <ander...@redhat.com>
- Date: Thu May 19 14:01:19 2016 -0400
-
-     Fix for Linux commit edf14cdbf9a0e5ab52698ca66d07a76ade0d5c46, which
-     has appended a NULL entry as the final member of the pageflag_names[]
-     array. Without the patch, a message that indicates "crash: failed to
-     read pageflag_names entry" is displayed during session initialization
-     in Linux 4.6 kernels.
-     (andrej.skvort...@gmail.com)
-
+ commit 098cdab16dfa6a85e9dad2cad604dee14ee15f66
+ Author: Dave Anderson <ander...@redhat.com>
+ Date: Fri Feb 12 14:32:53 2016 -0500
+
+     Fix for the changes made to the kernel module structure introduced by
+     this kernel commit for Linux 4.5 and later kernels:
+
+         commit 8244062ef1e54502ef55f54cced659913f244c3e
+         modules: fix longstanding /proc/kallsyms vs module insertion race.
+
+     Without the patch, the crash session fails during initialization
+     with the error message: "crash: invalid structure member offset:
+     module_num_symtab".
+     (ander...@redhat.com)
+
+ commit 6f1f78e33474d00d5f261d7ed9d835c558b34d61
+ Author: Dave Anderson <ander...@redhat.com>
+ Date: Wed Jan 20 09:56:36 2016 -0500
+
+     Fix for the changes made to the kernel module structure introduced by
+     this kernel commit for Linux 4.5 and later kernels:
+
+         commit 7523e4dc5057e157212b4741abd6256e03404cf1
+         module: use a structure to encapsulate layout.
+
+     Without the patch, the crash session fails during initialization
+     with the error message: "crash: invalid structure member offset:
+     module_init_text_size".
+     (seb...@linux.vnet.ibm.com)
+
+ commit 1e92f9fad3a7e3042b16996306cb2335760ef8c8
+ Author: Dave Anderson <ander...@redhat.com>
+ Date: Mon Feb 1 16:10:49 2016 -0500
+
+     Fix for the replacements made to the kernel's cpu_possible_mask,
+     cpu_online_mask, cpu_present_mask and cpu_active_mask symbols in
+     this kernel commit for Linux 4.5 and later kernels:
+
+         commit 5aec01b834fd6f8ca49d1aeede665b950d0c148e
+         kernel/cpu.c: eliminate cpu_*_mask
+
+     Without the patch, behavior is architecture-specific, dependent upon
+     whether the cpu mask values are used to calculate the number of cpus.
+     For example, ARM64 crash sessions fail during session initialization
+     with the error message "crash: zero-size memory allocation! (called
+     from <address>)", whereas X86_64 sessions come up normally, but
+     cpu mask values of zero are stored internally.
+     (ander...@redhat.com)
+
+ commit 182914debbb9a2671ef644027fedd339aa9c80e0
+ Author: Dave Anderson <ander...@redhat.com>
+ Date: Fri Sep 23 09:09:15 2016 -0400
+
+     With the introduction of radix MMU in Power ISA 3.0, there are
+     changes in kernel page table management accommodating it. This patch
+     series makes appropriate changes here to work for such kernels.
+     Also, this series fixes a few bugs along the way:
+
+     ppc64: fix vtop page translation for 4K pages
+     ppc64: Use kernel terminology for each level in 4-level page table
+     ppc64/book3s: address changes in kernel v4.5
+     ppc64/book3s: address change in page flags for PowerISA v3.0
+     ppc64: use physical addresses and unfold pud for 64K page size
+     ppc64/book3s: support big endian Linux page tables
+
+     The patches are needed for Linux v4.5 and later kernels on all
+     ppc64 hardware.
+ commit 8ceb1ac628bf6a0a7f0bbfff030ec93081bca4cd
+ Author: Dave Anderson <ander...@redhat.com>
+ Date: Mon May 23 11:23:01 2016 -0400
+
+     Fix for Linux commit 0139aa7b7fa12ceef095d99dc36606a5b10ab83a, which
+     renamed the page._count member to page._refcount. Without the patch,
+     certain "kmem" commands fail with the "kmem: invalid structure member
+     offset: page_count".
+     (ander...@redhat.com)
+
+ commit 7136bf8495948cb059e5595b8503f8ae37019fa1
+ Author: Dave Anderson <ander...@redhat.com>
+ Date: Thu May 19 14:01:19 2016 -0400
+
+     Fix for Linux commit edf14cdbf9a0e5ab52698ca66d07a76ade0d5c46, which
+     has appended a NULL entry as the final member of the pageflag_names[]
+     array. Without the patch, a message that indicates "crash: failed to
+     read pageflag_names entry" is displayed during session initialization
+     in Linux 4.6 kernels.
+     (andrej.skvort...@gmail.com)

  2. The following makedumpfile commits are needed:
- commit 5bc1f520cc7ab6e18abdd5af21c80ecda6339eb5
- Author: Atsushi Kumagai <ats-kuma...@wm.jp.nec.com>
- Date: Tue Jan 26 10:11:33 2016 +0900
-
-     [PATCH] Looking for page.compound_order/compound_dtor to exclude hugepages
-
-     * Required for kernel 4.4
-
-     Due to some changes in struct page, hugepages wouldn't be removed on
-     linux 4.4. makedumpfile reads page.lru.prev to get "order" (number of hugepages)
-     and page.lru.next to get "dtor" (destructor for hugepages) to detect hugepages,
-     but the offsets of the two was changed in linux 4.4.
-
-     kernel version  | where is order            | where is dtor
-     ----------------+---------------------------+---------------------------
-     - v3.19         | lru.prev                  | lru.next
-     v4.0 - v4.3     | compound_order(=lru.prev) | compound_dtor(=lru.next)
-     v4.4 -          | compound_order            | compound_dtor
-
-     As above, OFFSET(page.compound_order) and OFFSET(page.compound_dtor) are
-     definitely necessary in VMCOREINFO on linux 4.4 and later.
-
-     Further, the content of page.compound_dtor was changed from direct address
-     of dtor to the ID of it in linux 4.4.
-
-     Signed-off-by: Atsushi Kumagai <ats-kuma...@wm.jp.nec.com>
-
- commit 13b4233e91a9d5aa14c4b0643af36cbc29b9fa7a
- Author: Atsushi Kumagai <ats-kuma...@wm.jp.nec.com>
- Date: Wed Feb 24 17:09:44 2016 +0900
-
-     [PATCH] Skip examining compound tail pages
-
-     * Required for kernel 4.5
-
-     For filtering user pages, we check whether each page's
-     page->mapping have PAGE_MAPPING_ANON bit.
-     However, unexcludable compound tail pages can have
-     PAGE_MAPPING_ANON since kernel 4.5, they can be excluded
-     as user page wrong.
-
-     Now, we don't need to check compound tail pages because
-     excludable compound pages must be excluded at a time by
-     exclude_range() when the corresponding head page is checked.
-     So just skipping tail pages can avoid wrong filtering.
-
-     Signed-off-by: Atsushi Kumagai <ats-kuma...@wm.jp.nec.com>
-
+ commit 5bc1f520cc7ab6e18abdd5af21c80ecda6339eb5
+ Author: Atsushi Kumagai <ats-kuma...@wm.jp.nec.com>
+ Date: Tue Jan 26 10:11:33 2016 +0900
+
+     [PATCH] Looking for page.compound_order/compound_dtor to exclude
+     hugepages
+
+     * Required for kernel 4.4
+
+     Due to some changes in struct page, hugepages wouldn't be removed on
+     linux 4.4. makedumpfile reads page.lru.prev to get "order" (number of hugepages)
+     and page.lru.next to get "dtor" (destructor for hugepages) to detect hugepages,
+     but the offsets of the two was changed in linux 4.4.
+
+     kernel version  | where is order            | where is dtor
+     ----------------+---------------------------+---------------------------
+     - v3.19         | lru.prev                  | lru.next
+     v4.0 - v4.3     | compound_order(=lru.prev) | compound_dtor(=lru.next)
+     v4.4 -          | compound_order            | compound_dtor
+
+     As above, OFFSET(page.compound_order) and OFFSET(page.compound_dtor) are
+     definitely necessary in VMCOREINFO on linux 4.4 and later.
+
+     Further, the content of page.compound_dtor was changed from direct address
+     of dtor to the ID of it in linux 4.4.
+
+     Signed-off-by: Atsushi Kumagai <ats-kuma...@wm.jp.nec.com>
+
+ commit 13b4233e91a9d5aa14c4b0643af36cbc29b9fa7a
+ Author: Atsushi Kumagai <ats-kuma...@wm.jp.nec.com>
+ Date: Wed Feb 24 17:09:44 2016 +0900
+
+     [PATCH] Skip examining compound tail pages
+
+     * Required for kernel 4.5
+
+     For filtering user pages, we check whether each page's
+     page->mapping have PAGE_MAPPING_ANON bit.
+     However, unexcludable compound tail pages can have
+     PAGE_MAPPING_ANON since kernel 4.5, they can be excluded
+     as user page wrong.
+
+     Now, we don't need to check compound tail pages because
+     excludable compound pages must be excluded at a time by
+     exclude_range() when the corresponding head page is checked.
+     So just skipping tail pages can avoid wrong filtering.
+
+     Signed-off-by: Atsushi Kumagai <ats-kuma...@wm.jp.nec.com>

  3. The linux-image dbgsym version installed must be pulled from a different repo
- instead of the one meant for 16.04.2 because the gcc version of kernel
- image (/boot/vmlinux-4.8.0-34-generic) and the vmlinux with debug
- symbols(usr/lib/debug/boot/vmlinux-4.8.0-34-generic) don't match.
-
- Please use the following repos
-
- sudo tee /etc/apt/sources.list.d/ddebs.list << EOF
- deb http://ddebs.ubuntu.com/ $(lsb_release -cs) main restricted universe multiverse
- deb http://ddebs.ubuntu.com/ $(lsb_release -cs)-security main restricted universe multiverse
- deb http://ddebs.ubuntu.com/ $(lsb_release -cs)-updates main restricted universe multiverse
- deb http://ddebs.ubuntu.com/ $(lsb_release -cs)-proposed main restricted universe multiverse
- EOF
-
- to install linux-image-4.8.0-34-generic-dbgsym package.
+ instead of the one meant for 16.04.2 because the gcc version of kernel
+ image (/boot/vmlinux-4.8.0-34-generic) and the vmlinux with debug
+ symbols(usr/lib/debug/boot/vmlinux-4.8.0-34-generic) don't match.
+
+ Please use the following repos
+
+ sudo tee /etc/apt/sources.list.d/ddebs.list << EOF
+ deb http://ddebs.ubuntu.com/ $(lsb_release -cs) main restricted universe multiverse
+ deb http://ddebs.ubuntu.com/ $(lsb_release -cs)-security main restricted universe multiverse
+ deb http://ddebs.ubuntu.com/ $(lsb_release -cs)-updates main restricted universe multiverse
+ deb http://ddebs.ubuntu.com/ $(lsb_release -cs)-proposed main restricted universe multiverse
+ EOF
+
+ to install linux-image-4.8.0-34-generic-dbgsym package.

  Thanks
- [snip]
- >
+ >
  > 3. The linux-image dbgsym version installed must be pulled from a different
  > repo

  s/must be pulled/must have been pulled/

  Applied crash utility's missing patches on top of crash-7.1.4-1ubuntu4
  and makedumpfile tool's missing patches on top of
  makedumpfile-1.5.9-5ubuntu0.3.

  Did some sanity testing of the patched binaries. The binaries were
  working as expected.
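
For reference, the kernel-version table in the first makedumpfile commit message above can be restated as a small lookup. This is only an illustrative Python sketch: the function name hugepage_fields and the returned tuple layout are made up here and are not part of makedumpfile, which is written in C and takes these offsets from VMCOREINFO.

```python
def hugepage_fields(kernel_version):
    """Return (order_field, dtor_field, dtor_is_id) for a kernel version string.

    Hypothetical helper that restates the table from the commit message;
    it is not makedumpfile's actual code.
    """
    # Compare only major.minor, e.g. "4.8.0-34-generic" -> (4, 8).
    base = kernel_version.split("-")[0]
    major, minor = (int(part) for part in base.split(".")[:2])
    version = (major, minor)

    if version < (4, 0):
        # Up to v3.19: order and dtor are read through page.lru.
        return ("lru.prev", "lru.next", False)
    if version < (4, 4):
        # v4.0 - v4.3: named members that still alias the lru pointers.
        return ("compound_order", "compound_dtor", False)
    # v4.4 and later: separate offsets, and compound_dtor holds an ID
    # rather than the destructor's address.
    return ("compound_order", "compound_dtor", True)


if __name__ == "__main__":
    for release in ("3.19", "4.2.0", "4.8.0-34-generic"):
        print(release, hugepage_fields(release))
```

The 4.8.0-34-generic kernel from this bug falls in the last row of the table, the case the commit marks as requiring OFFSET(page.compound_order) and OFFSET(page.compound_dtor) in VMCOREINFO.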
-- 
https://bugs.launchpad.net/bugs/1655625

Title:
  ISST-LTE:pVM:roselp4:ubuntu 16.04.2: vmcore cannot be analysed by crash