Thanks Kellen and Cascardo.

Kellen, it's great that you're willing to work on this. It is a
long-standing problem, and the work would definitely be appreciated by
the Ubuntu community, both free users and Ubuntu Advantage customers!

Cascardo, about your two risks:

(a) I partially agree. I agree about the testing part: that is
definitely the biggest chunk of work here. But I disagree with the
backward-compatibility claim: of course we need to enforce it, but it's
not that difficult in the LTS->LTS+1 model. We have a small number of
HWE kernels per LTS release (4 or 5, I guess, correct?). We need to be
sure the makedumpfile updates are compatible with them, and that's it.

If we're talking about Focal and some makedumpfile update
(unintentionally) breaks dumps for kernel 4.19 or earlier, why should we
care, given that it's an unsupported scenario?
IMHO it's much better to ensure that every HWE kernel receives a
properly functional makedumpfile update than to be overly cautious about
older/unsupported kernels.

(b) I agree here, but I think the SRU exception / LTS->LTS+1 model would
only make this easier. Imagine that, when Ubuntu version X is released,
upstream makedumpfile does not handle the recent kernel version used by
X well. We could then fix (or report) the makedumpfile issue quickly,
especially thanks to the improved testing from part (a) above. Once it's
fixed, either by Canonical or by the community, the fix could be
integrated through a fast process, for example a version bump of
makedumpfile.

In the end, I think testing is the key word here: the more serious and
thorough the tests makedumpfile has, the more confidence we'd have in
such a model. Hopefully, with Kellen's effort, the lack of testing stops
blocking a more proactive approach to makedumpfile, in which we update
it before users report bugs (waiting for bug reports has been the norm
for this package since forever).

Cheers!

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to makedumpfile in Ubuntu.
https://bugs.launchpad.net/bugs/1970672

Title:
  makedumpfile falls back to cp with "__vtop4_x86_64: Can't get a valid
  pmd_pte."

Status in makedumpfile package in Ubuntu:
  New

Bug description:
  [Impact] 
   * On Focal with an HWE (>=5.12) kernel, makedumpfile can sometimes
  fail with "__vtop4_x86_64: Can't get a valid pmd_pte."

   * makedumpfile falls back to cp for the dump, resulting in extremely
  large vmcores. This can impact both collection and analysis due to
  lack of space for the resulting vmcore.

   * This is fixed by an upstream commit present in versions 1.7.0 and
  1.7.1:

  https://github.com/makedumpfile/makedumpfile/commit/646456862df8926ba10dd7330abf3bf0f887e1b6

  commit 646456862df8926ba10dd7330abf3bf0f887e1b6
  Author: Kazuhito Hagio <k-hagio...@nec.com>
  Date:   Wed May 26 14:31:26 2021 +0900

      [PATCH] Increase SECTION_MAP_LAST_BIT to 5
      
      * Required for kernel 5.12
      
      Kernel commit 1f90a3477df3 ("mm: teach pfn_to_online_page() about
      ZONE_DEVICE section collisions") added a section flag
      (SECTION_TAINT_ZONE_DEVICE) and causes makedumpfile an error on
      some machines like this:
      
        __vtop4_x86_64: Can't get a valid pmd_pte.
        readmem: Can't convert a virtual address(ffffe2bdc2000000) to physical address.
        readmem: type_addr: 0, addr:ffffe2bdc2000000, size:32768
        __exclude_unnecessary_pages: Can't read the buffer of struct page.
        create_2nd_bitmap: Can't exclude unnecessary pages.
      
      Increase SECTION_MAP_LAST_BIT to 5 to fix this.  The bit had not
      been used until the change, so we can just increase the value.
      
      Signed-off-by: Kazuhito Hagio <k-hagio...@nec.com>
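
  To illustrate why the extra section flag matters, here is a minimal
  sketch (in Python, not makedumpfile's actual C code) of how a mask
  derived from SECTION_MAP_LAST_BIT strips flag bits from an encoded
  section_mem_map value. The flag bit positions follow the 5.12 kernel
  headers as I understand them; the previous makedumpfile value of 4,
  the helper names, and the sample address are assumptions made only
  for illustration, and the real encoding carries more than these
  flags.

```python
# Flag bits kept in the low bits of the encoded section_mem_map
# (positions as in kernel 5.12; treat them as an assumption here).
SECTION_MARKED_PRESENT = 1 << 0
SECTION_HAS_MEM_MAP = 1 << 1
SECTION_IS_ONLINE = 1 << 2
SECTION_IS_EARLY = 1 << 3
SECTION_TAINT_ZONE_DEVICE = 1 << 4  # added by kernel commit 1f90a3477df3


def section_map_mask(last_bit_shift):
    """Return a mask that clears every bit below (1 << last_bit_shift)."""
    return ~((1 << last_bit_shift) - 1)


def decode_mem_map(encoded, last_bit_shift):
    """Strip the flag bits from an encoded section_mem_map value."""
    return encoded & section_map_mask(last_bit_shift)


# Hypothetical mem_map address, chosen only for illustration.
mem_map = 0xFFFFEA0000000000
flags = SECTION_MARKED_PRESENT | SECTION_HAS_MEM_MAP | SECTION_TAINT_ZONE_DEVICE
encoded = mem_map | flags

# With the (assumed) old shift of 4, bit 4 is not masked out and leaks
# into the decoded address; with 5, the address is recovered intact.
old = decode_mem_map(encoded, 4)
new = decode_mem_map(encoded, 5)
print(hex(old))
print(hex(new))
```

  With the smaller mask the decoded value still carries the
  SECTION_TAINT_ZONE_DEVICE bit, so it no longer points at the real
  mem_map, which is consistent with the address translation failure
  shown in the log above.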

  [Test Plan]
   * Confirm that makedumpfile works as expected by triggering a kdump.

   * Confirm that the patched makedumpfile works as expected on a system
  known to experience the issue.

   * Confirm that the patched makedumpfile can compress a cp-generated
  vmcore from a known affected system. The unpatched version fails on
  such a vmcore.

  [Where problems could occur]

   * This change could adversely affect the collection/compression of
  vmcores during a kdump situation resulting in fallback to cp.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/makedumpfile/+bug/1970672/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp