On Mon, Mar 09, 2026 at 12:49:49PM +0530, Aditya Gupta wrote:
> Overview
> =========
>
> Implemented MPIPL (Memory Preserving IPL, aka fadump) on PowerNV machine
> in QEMU.
>
> Fadump is an alternative dump mechanism to kdump, in which the firmware
> does a memory-preserving boot and the second/crashkernel is booted fresh,
> like a normal system reset, instead of the crashed kernel loading the
> second/crashkernel as in kdump.
>
> MPIPL on PowerNV is similar to fadump on Pseries. The idea is the same:
> memory preservation, where on PowerNV we are assisted by SBE (Self Boot
> Engine) & Hostboot, while on Pseries we are assisted by PHyp (Power
> Hypervisor).
>
> For implementing in baremetal/powernv QEMU, we need to export a
> "ibm,opal/dump" node in the device tree, to tell the kernel we support
> MPIPL
>
> Once the kernel sees the support and "fadump=on" is passed on the command
> line, the kernel registers the memory regions to preserve with Skiboot.
>
> The kernel sends this data using OPAL calls, after which skiboot/opal saves
> the memory region details to the MDST and MDDT tables (S-source,
> D-destination).
>
> Then, in the event of a kernel crash, the kernel initiates MPIPL with
> another OPAL call (opal_cec_reboot2); this request goes to Skiboot.
> Skiboot then triggers the "S0 Interrupt" to the SBE (Self Boot Engine),
> along with OPAL's relocated base address.
>
> SBE then stops all core clocks, and only does particular ISteps for a
> memory preserving boot.
>
> Then, hostboot comes up, and with help of the relocated base address, it
> accesses MDST & MDDT tables (S-source and D-destination), and preserves the
> memory regions according to the data in these tables.
> And after preserving, it writes the preserved memory region details to MDRT
> tables (R-Result), for the kernel to know where/whether a memory region is
> preserved.
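
To check my understanding of this step, here is a toy model of the
source/destination/result flow. It is only a sketch: the Region class, the
field names and the dictionary layout of the result entries are made up for
illustration and do not match the real MDST/MDDT/MDRT formats used by
skiboot and hostboot.

```python
from dataclasses import dataclass


@dataclass
class Region:
    addr: int  # start offset into the toy memory (hypothetical field)
    size: int  # length in bytes (hypothetical field)


def preserve(memory: bytearray, mdst: list, mddt: list) -> list:
    """Copy each source region to its destination and record a result entry."""
    mdrt = []
    for src, dst in zip(mdst, mddt):
        n = min(src.size, dst.size)
        if max(src.addr, dst.addr) + n > len(memory):
            raise ValueError("region falls outside the toy memory")
        # The right-hand slice is a copy, so overlapping regions are safe.
        memory[dst.addr:dst.addr + n] = memory[src.addr:src.addr + n]
        mdrt.append({"src": src.addr, "dest": dst.addr, "size": n})
    return mdrt


# One 8-byte region registered before the "crash", preserved at offset 32.
memory = bytearray(64)
memory[0:8] = b"CRASHED!"
mdrt = preserve(memory, [Region(0, 8)], [Region(32, 8)])
print(mdrt)
```

The result list plays the role of the MDRT here: the second kernel only needs
to look at it to know where each preserved region ended up.
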
>
> Both the SBE's and hostboot's responsibilities are implemented in the SBE
> code in QEMU.
>
> Then in the second kernel/crashkernel boot, OPAL passes the "mpipl-boot"
> property for the kernel to know that a dump is active, which the kernel
> then exports as /proc/vmcore.
>
> Testing
> ====================
>
> 1. Git tree for testing:
> https://gitlab.com/adi-g15-ibm/qemu/tree/fadump-powernv-v4
>
> 2. Gitlab pipeline: https://gitlab.com/adi-g15-ibm/qemu/-/pipelines/2372098708
>
> 3. Analysing generated vmcore:
>
> # ls -lh /proc/vmcore
> -r-------- 1 root root 4.5G Feb 25 07:33 /proc/vmcore
> # file /proc/vmcore
> /proc/vmcore: ELF 64-bit LSB core file, 64-bit PowerPC or cisco 7500,
> OpenPOWER ELF V2 ABI, version 1 (SYSV), SVR4-style
>
> # crash vmlinux vmcore
> ...
> KERNEL: vmlinux-38fec10eb60d-network
> DUMPFILE: vmcore-powernv-25feb26
> CPUS: 4
> DATE: Thu Jan 1 05:30:00 IST 1970
> UPTIME: 00:05:23
> LOAD AVERAGE: 0.12, 0.08, 0.03
> TASKS: 101
> NODENAME: buildroot
> RELEASE: 6.14.0
> VERSION: #1 SMP Thu Apr 3 08:06:13 CDT 2025
> MACHINE: ppc64le (1000 Mhz)
> MEMORY: 6 GB
> PANIC: "Kernel panic - not syncing: sysrq triggered crash"
> PID: 257
> COMMAND: "sh"
> TASK: c000000008066600 [THREAD_INFO: c000000008066600]
> CPU: 2
> STATE: TASK_RUNNING (PANIC)
>
> crash> # ps and kmem -i works
>
Hi Aditya,
I was able to get fadump working on buildroot images with this.
Thanks for this feature.
Welcome to Buildroot
buildroot login: root
# dmesg | grep fadump
[ 0.000000][ T0] opal fadump: Kernel metadata addr: 653902a8
[ 0.000000][ T0] fadump: Reserved 768MB of memory at 0x00000035390000 (System RAM: 8192MB)
[ 0.000000][ T0] fadump: Initialized [0x36000000, 752MB] cma area from [0x35390000, 768MB] bytes of memory reserved for firmware-assisted dump
[ 0.000000][ T0] Kernel command line: console=hvc0 rootwait root=/dev/nvme0n1 fadump=on
[ 0.377005][ T1] opal fadump: Registration is successful!
# uname -r
6.19.6
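
For anyone who wants to script the same check, a minimal sketch that pulls
the reservation details out of the dmesg text above. The function name and
the returned keys are my own choices, and the regular expression only
assumes the message wording seen in this log; other kernel versions may word
it differently.

```python
import re

# Patterns taken from the log lines above (assumed stable only for this kernel).
RESERVED = re.compile(
    r"fadump: Reserved (\d+)MB of memory at (0x[0-9a-fA-F]+) "
    r"\(System RAM: (\d+)MB\)"
)
REGISTERED = "opal fadump: Registration is successful!"


def check_fadump(dmesg: str) -> dict:
    """Summarise the fadump reservation and registration state from dmesg text."""
    text = " ".join(dmesg.split())  # undo any line wrapping added by mail clients
    m = RESERVED.search(text)
    return {
        "reserved_mb": int(m.group(1)) if m else None,
        "base": int(m.group(2), 16) if m else None,
        "ram_mb": int(m.group(3)) if m else None,
        "registered": REGISTERED in text,
    }


sample = """[ 0.000000][ T0] fadump: Reserved 768MB of memory at
0x00000035390000 (System RAM: 8192MB)
[ 0.377005][ T1] opal fadump: Registration is successful!"""
result = check_fadump(sample)
print(result)
```
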
<snip/>
Welcome to Buildroot
buildroot login: root
# ls /proc/vmcore
/proc/vmcore
# ls -alh /proc/vmcore
-r-------- 1 root root 4.3G Mar 8 17:08 /proc/vmcore
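
A quick way to sanity check the dump without the "file" utility is to look
at the first 20 bytes of the ELF header. This is a sketch using the standard
ELF constants (ET_CORE = 4, EM_PPC64 = 21); the describe_elf name and the
returned keys are made up for illustration, and the sample header below is
synthetic, not read from my guest.

```python
import struct


def describe_elf(header: bytes) -> dict:
    """Decode class, endianness, type and machine from an ELF header prefix."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    little = header[5] == 1  # EI_DATA: 1 = LSB, 2 = MSB
    e_type, e_machine = struct.unpack("<HH" if little else ">HH", header[16:20])
    return {
        "is_64bit": header[4] == 2,   # EI_CLASS: 2 = ELFCLASS64
        "little_endian": little,
        "is_core": e_type == 4,       # ET_CORE
        "is_ppc64": e_machine == 21,  # EM_PPC64
    }


# Synthetic 64-bit LSB core header for ppc64, built only for this example.
sample = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9) + struct.pack("<HH", 4, 21)
info = describe_elf(sample)
print(info)

# In the crash kernel one could instead do:
#   with open("/proc/vmcore", "rb") as f:
#       print(describe_elf(f.read(20)))
```
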
Tested-by: Shivang Upadhyay <[email protected]>
~Shivang.