Hello Mike,

On Thu, Jan 22, 2026 at 12:57:50PM +0200, Mike Rapoport wrote:
> > +/**
> > + * DOC: Kexec Metadata ABI
> > + *
> 
> It would be nice to link it from Documentation/ as well ;-)

Ack! I am planning something like this:

        commit 90e098ca0d611b44594f08e50ba1cff3c932dd2b
        Author: Breno Leitao <[email protected]>
        Date:   Thu Jan 22 03:47:23 2026 -0800

        kho: document kexec-metadata tracking feature
        
        Add documentation for the kexec-metadata feature that tracks the
        previous kernel version and kexec boot count across kexec reboots.
        This helps diagnose bugs that only reproduce when kexecing from
        specific kernel versions.
        
        Suggested-by: Mike Rapoport <[email protected]>
        Signed-off-by: Breno Leitao <[email protected]>

        diff --git a/Documentation/admin-guide/mm/kho.rst b/Documentation/admin-guide/mm/kho.rst
        index 6dc18ed4b8861..1faf2c3ba4620 100644
        --- a/Documentation/admin-guide/mm/kho.rst
        +++ b/Documentation/admin-guide/mm/kho.rst
        @@ -113,3 +113,42 @@ stabilized.
        ``/sys/kernel/debug/kho/in/sub_fdts/``
        Similar to ``kho/out/sub_fdts/``, but contains sub FDT blobs
        of KHO producers passed from the old kernel.
        +
        +Kexec Metadata
        +==============
        +
        +KHO automatically tracks metadata about the kexec chain, passing information
        +about the previous kernel to the next kernel. This feature helps diagnose
        +bugs that only reproduce when kexecing from specific kernel versions.
        +
        +On each KHO kexec, the kernel logs the previous kernel's version and the
        +number of kexec reboots since the last cold boot::
        +
        +    [    0.000000] KHO: exec from: 6.19.0-rc4-next-20260107 (count 1)
        +
        +The metadata includes:
        +
        +``previous_release``
        +    The kernel version string (from ``uname -r``) of the kernel that
        +    initiated the kexec.
        +
        +``kexec_count``
        +    The number of kexec boots since the last cold boot. On cold boot,
        +    this counter starts at 0 and increments with each kexec. This helps
        +    identify issues that only manifest after multiple consecutive kexec
        +    reboots.
        +
        +Use Cases
        +---------
        +
        +This metadata is particularly useful for debugging kexec transition bugs,
        +where a buggy kernel kexecs into a new kernel and the bug manifests only
        +in the second kernel. Examples of such bugs include:
        +
        +- Memory corruption from the previous kernel affecting the new kernel
        +- Incorrect hardware state left by the previous kernel
        +- Firmware/ACPI state issues that only appear in kexec scenarios
        +
        +At scale, correlating crashes to the previous kernel version enables
        +faster root cause analysis when issues only occur in specific kernel
        +transition scenarios.


> > diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
> 
> ...
> 
> >  static __init int kho_init(void)
> >  {
> >     const void *fdt = kho_get_fdt();
> > @@ -1357,6 +1413,15 @@ static __init int kho_init(void)
> >     if (err)
> >             goto err_free_fdt;
> >  
> > +   if (fdt)
> > +           kho_process_kexec_metadata();
> 
> Can't we move it into the existing if (fdt) below?

Unfortunately, that won't work due to a data dependency between the two
functions.

kho_process_kexec_metadata() reads the incoming FDT subtree and populates
kho_in. Basically:

        kho_in.kexec_count = metadata->kexec_count;

kho_populate_kexec_metadata() then derives the outgoing
metadata->kexec_count from kho_in:

        /* kho_in.kexec_count is set to 0 on cold boot */
        metadata->kexec_count = kho_in.kexec_count + 1;

If kho_process_kexec_metadata() is moved after kho_populate_kexec_metadata(),
the count would always increment from 0 to 1, ignoring whatever was passed in
the FDT.
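
To make the ordering concrete, here is a minimal userspace sketch (not the
kernel code). The structs and the process()/populate() helpers are
hypothetical stand-ins for kho_in, the metadata subtree and the two functions
above; only the two assignments are taken from the patch.

```c
#include <assert.h>

/* Hypothetical stand-ins for the FDT metadata and kho_in; not kernel types. */
struct metadata { unsigned int kexec_count; };
struct kho_in_state { unsigned int kexec_count; };

/* Mirrors kho_process_kexec_metadata(): read what the previous kernel passed. */
static void process(struct kho_in_state *in, const struct metadata *from_prev)
{
	in->kexec_count = from_prev->kexec_count;
}

/* Mirrors kho_populate_kexec_metadata(): prepare the count for the next kernel. */
static void populate(const struct kho_in_state *in, struct metadata *for_next)
{
	for_next->kexec_count = in->kexec_count + 1;
}

/* Order used in the patch: process first, then populate. */
static unsigned int process_then_populate(unsigned int incoming)
{
	struct metadata prev = { incoming }, next = { 0 };
	struct kho_in_state in = { 0 };

	process(&in, &prev);
	populate(&in, &next);
	return next.kexec_count;
}

/* Swapped order: populate sees kho_in still at 0. */
static unsigned int populate_then_process(unsigned int incoming)
{
	struct metadata prev = { incoming }, next = { 0 };
	struct kho_in_state in = { 0 };

	populate(&in, &next);
	process(&in, &prev);
	return next.kexec_count;
}

/* Self-check: only the first order carries the incoming count forward. */
void check_ordering(void)
{
	assert(process_then_populate(3) == 4);
	assert(populate_then_process(3) == 1);
}
```

With an incoming count of 3, the first order hands 4 to the next kernel, while
the swapped order hands over 1 regardless of the input.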

Restructuring to call kho_in_debugfs_init() earlier also doesn't work:


        if (fdt) {
                kho_in_debugfs_init(&kho_in.dbg, fdt);
                kho_process_kexec_metadata();
                return 0;
        }

        /* Populate kexec metadata for the possible next kexec */
        err = kho_populate_kexec_metadata();
        if (err)
                pr_warn("failed to initialize kexec-metadata subtree: %d\n",
                        err);

This would return early without populating the kexec metadata for the next
kexec, breaking the chain on KHO boots.
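
A rough userspace model of that failure, under the same caveat: struct
metadata, boot() and the return_early flag are hypothetical and only imitate
the control flow sketched above, where a boot with an incoming FDT returns
before populating anything for its successor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the metadata handed from one kernel to the next. */
struct metadata {
	bool valid;               /* false: nothing was populated */
	unsigned int kexec_count;
};

/*
 * Simulate one boot and return what it prepares for the next kernel.
 * incoming == NULL (or !valid) models a cold boot or missing metadata.
 */
static struct metadata boot(const struct metadata *incoming, bool return_early)
{
	struct metadata outgoing = { false, 0 };
	unsigned int count = 0;

	if (incoming && incoming->valid) {
		count = incoming->kexec_count;
		if (return_early)
			return outgoing;  /* KHO boot: nothing populated */
	}

	outgoing.valid = true;
	outgoing.kexec_count = count + 1;
	return outgoing;
}

/* Cold boot, then one kexec; report the count the third kernel would read. */
static unsigned int count_seen_by_third_kernel(bool return_early)
{
	struct metadata first = boot(NULL, return_early);
	struct metadata second = boot(&first, return_early);

	return second.valid ? second.kexec_count : 0;
}

/* Self-check: the early return loses the count after the first kexec. */
void check_chain(void)
{
	assert(count_seen_by_third_kernel(false) == 2);
	assert(count_seen_by_third_kernel(true) == 0);
}
```

In this model the third kernel reads a count of 2 without the early return and
finds no metadata at all with it.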

Please let me know if I am missing any other option.

> > +
> > +   /* Populate kexec metadata for the possible next kexec */
> > +   err = kho_populate_kexec_metadata();
> > +   if (err)
> > +           pr_warn("failed to initialize kexec-metadata subtree: %d\n",
> > +                   err);
> 
> Please follow if (err) goto err_ pattern.
> 
> kho_populate_kexec_metadata() failure essentially means that we failed to
> allocate memory. This shouldn't happen that early in boot, but if it did,
> then something is utterly wrong.

Ack!

Thanks for the review,
--breno
