Hi Kellen, thanks a lot for reporting and fixing that!

I'd like to take the opportunity to discuss something related: no matter
how many bugs we fix in makedumpfile / crash, more will come as the kernel
version bumps. The kernel has no stable internal ABI, so kernel developers
can "break" compatibility with such tools, although the makedumpfile
maintainer (and crash's as well!) is really great at keeping up with that
and releasing proactive fixes even before the kernel change is merged.

But the problem is: in the Ubuntu ecosystem, although we have the HWE
concept for the kernel, these packages are not part of kernel HWE upgrades;
hence, they get "stuck" and become subject to bugs when a new HWE kernel is
released. It happens all the time and will continue happening...

We had discussions in the past (and I'm hereby CCing the interested
parties: DannF, Dan Streetman, Heitor and Cascardo) about sync'ing
makedumpfile and crash with kernel HWE upgrades. So, this might be a
good opportunity to do it.

The idea was more or less like this: update makedumpfile/crash on a Release
to keep them in sync with Release+1 until the next LTS. So, in the end, the
LTS version == the LTS+1 version, and then we stop upgrading/syncing these
packages. The cycle then restarts for LTS+1, up to the release of LTS+2.

Hopefully this plan (or something similar) is eventually followed; I bet all
users/customers would be glad not to face makedumpfile/crash bugs due to
kernel upgrades anymore!

Cheers, and thanks for the attention =D

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to makedumpfile in Ubuntu.
https://bugs.launchpad.net/bugs/1970672

Title:
  makedumpfile falls back to cp with "__vtop4_x86_64: Can't get a valid
  pmd_pte."

Status in makedumpfile package in Ubuntu:
  New

Bug description:
  [Impact] 
   * On Focal with an HWE (>=5.12) kernel, makedumpfile can sometimes fail
  with "__vtop4_x86_64: Can't get a valid pmd_pte."

   * makedumpfile falls back to cp for the dump, resulting in extremely
  large vmcores. This can impact both collection and analysis due to
  lack of space for the resulting vmcore.

   * This is fixed by an upstream commit present in versions 1.7.0 and 1.7.1:
  
https://github.com/makedumpfile/makedumpfile/commit/646456862df8926ba10dd7330abf3bf0f887e1b6

  commit 646456862df8926ba10dd7330abf3bf0f887e1b6
  Author: Kazuhito Hagio <k-hagio...@nec.com>
  Date:   Wed May 26 14:31:26 2021 +0900

      [PATCH] Increase SECTION_MAP_LAST_BIT to 5
      
      * Required for kernel 5.12
      
      Kernel commit 1f90a3477df3 ("mm: teach pfn_to_online_page() about
      ZONE_DEVICE section collisions") added a section flag
      (SECTION_TAINT_ZONE_DEVICE) and causes makedumpfile an error on
      some machines like this:
      
        __vtop4_x86_64: Can't get a valid pmd_pte.
        readmem: Can't convert a virtual address(ffffe2bdc2000000) to physical address.
        readmem: type_addr: 0, addr:ffffe2bdc2000000, size:32768
        __exclude_unnecessary_pages: Can't read the buffer of struct page.
        create_2nd_bitmap: Can't exclude unnecessary pages.
      
      Increase SECTION_MAP_LAST_BIT to 5 to fix this.  The bit had not
      been used until the change, so we can just increase the value.
      
      Signed-off-by: Kazuhito Hagio <k-hagio...@nec.com>
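
To illustrate why a mask that is one bit too narrow produces a bad address, here is a minimal sketch in Python. It is not taken from the makedumpfile or kernel sources: the bit position of SECTION_TAINT_ZONE_DEVICE, the previous SECTION_MAP_LAST_BIT value of 4, the mem_map address, and the helper names are all assumptions made for illustration. Only the new value of 5 comes from the commit message above.

```python
# Sketch only: bit positions and the previous value (4) are assumptions,
# not copied from makedumpfile or kernel sources.
SECTION_TAINT_ZONE_DEVICE = 1 << 4  # assumed position of the flag added in 5.12


def section_map_mask(last_bit: int, width: int = 64) -> int:
    """Return a mask that clears every bit below bit `last_bit`."""
    return ~((1 << last_bit) - 1) & ((1 << width) - 1)


def decode_mem_map(encoded: int, last_bit: int) -> int:
    """Strip the low flag bits from an encoded section mem_map value."""
    return encoded & section_map_mask(last_bit)


# Hypothetical mem_map address with the new flag set in its low bits.
MEM_MAP = 0xFFFFEA0000000000
ENCODED = MEM_MAP | SECTION_TAINT_ZONE_DEVICE

stale = decode_mem_map(ENCODED, 4)  # narrower mask: the flag bit stays in the address
fixed = decode_mem_map(ENCODED, 5)  # wider mask: the flag bit is cleared

print(hex(stale))  # 0xffffea0000000010
print(hex(fixed))  # 0xffffea0000000000
```

Under these assumptions, the narrower mask leaves the flag bit inside the decoded address, so later reads start from a wrong location, which is consistent with the "Can't get a valid pmd_pte" failure only appearing on machines where the new flag is actually set.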

  [Test Plan]
   * Confirm that makedumpfile works as expected by triggering a kdump.

   * Confirm that the patched makedumpfile works as expected on a system
  known to experience the issue.

   * Confirm that the patched makedumpfile can compress a cp-generated
  vmcore from a known affected system. The unpatched version fails.

  [Where problems could occur]

   * This change could adversely affect the collection/compression of
  vmcores during a kdump situation resulting in fallback to cp.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/makedumpfile/+bug/1970672/+subscriptions


