From: Dexuan Cui <[email protected]> Sent: Thursday, April 2, 2026 10:10 AM
> 
> > From: Michael Kelley <[email protected]>
> > Sent: Wednesday, January 21, 2026 11:11 PM
> > ...
> > From: Dexuan Cui <[email protected]> Sent: Wednesday, January 21,
> > 2026 6:04 PM
> > >
> > > There has been a longstanding MMIO conflict between the pci_hyperv
> > > driver's config_window (see hv_allocate_config_window()) and the
> > > hyperv_drm (or hyperv_fb) driver (see hyperv_setup_vram()): typically
> > > both get MMIO from the low MMIO range below 4GB; this is not an issue
> > > in the normal kernel since the VMBus driver reserves the framebuffer
> > > MMIO in vmbus_reserve_fb(), so the drm driver's hyperv_setup_vram()
> > > can always get the reserved framebuffer MMIO; however, a Gen2 VM's
> > > kdump kernel fails to reserve the framebuffer MMIO in vmbus_reserve_fb()
> > >  because the screen_info.lfb_base is zero in the kdump kernel:
> > > the screen_info is not initialized at all in the kdump kernel, because the
> > > EFI stub code, which initializes screen_info, doesn't run in the case of 
> > > kdump.
> >
> > I don't think this is correct. Yes, the EFI stub doesn't run, but screen_info
> 
> Hi Michael, sorry for delaying the reply for so long! I think I now
> understand all the details.
> 
> My earlier statement "the screen_info is not initialized at all in the kdump
> kernel" is not correct on x86, but I believe it's correct on ARM64. Please see
> my explanation below.

Sadly, I must agree. It's surprising, because it affects kexec scenarios that
don't include Hyper-V. On arm64 bare metal, if you kexec to a kernel configured
to run the efifb frame buffer driver, the driver won't load.

> 
> > should be initialized in the kdump kernel by the code that loads the
> > kdump kernel into the reserved crash memory. See discussion in the commit
> > message for commit 304386373007.
> >
> > I wonder if commit a41e0ab394e4 broke the initialization of screen_info
> > in the kdump kernel. Or perhaps there is now a rev-lock between the kernel
> > with this commit and a new version of the user space kexec command.
> 
> The commit
> a41e0ab394e4 ("sysfb: Replace screen_info with sysfb_primary_display")
> should be unrelated here.

Agreed.

> 
> > There's a parameter to the kexec() command that governs whether it
> > uses the kexec_file_load() system call or the kexec_load() system call.
> > I wonder if that parameter makes a difference in the problem described
> > for this patch.
> >
> > I can't immediately remember if, when I was working on commit
> > 304386373007, I tested kdump in a Gen 2 VM with an NVMe OS disk to
> > ensure that MMIO space was properly allocated to the frame buffer
> > driver (either hyperv_fb or hyperv_drm). I'm thinking I did, but tomorrow
> > I'll check for any definitive notes on that.
> >
> > Michael

Evidently, I did not fully test an arm64 VM, or I would have seen that
screen_info wasn't being populated for the kdump kernel.

> 
> If vmbus_reserve_fb() in the kdump kernel fails to reserve the framebuffer
> MMIO range due to a Gen2 VM's screen_info.lfb_base being 0,  the MMIO
> conflict between hyperv_fb/hyperv_drm and hv_pci happens -- this is
> especially an issue if hv_pci is built-in and hyperv_fb/hyperv_drm is built
> as modules. vmbus_reserve_fb() should always succeed for a Gen1 VM, since
> it can always get the framebuffer MMIO base from the legacy PCI graphics
> device, so we only need to discuss Gen2 VMs here.

Agreed.
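
For anyone following along, here is a minimal sketch of why a zero
lfb_base leads to the conflict. It is a toy model, not the kernel code:
the struct, the toy_* helpers, the first-fit policy and the addresses in
the usage note are all assumptions made for illustration. Only the 8MB
default and the "zero base means nothing gets reserved" behavior come
from this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FB_DEFAULT_SIZE (8ULL << 20) /* 8MB default mentioned in this thread */

struct mmio_range {
    uint64_t start;
    uint64_t size; /* size == 0 means "nothing reserved" */
};

/* Toy stand-in for the Gen2 path of vmbus_reserve_fb(): with a zero
 * lfb_base there is nothing to reserve. */
static struct mmio_range toy_reserve_fb(uint64_t lfb_base)
{
    struct mmio_range r = { 0, 0 };

    if (lfb_base != 0) {
        r.start = lfb_base;
        r.size = FB_DEFAULT_SIZE;
    }
    return r;
}

/* True if [a, a + a_size) and [b, b + b_size) share at least one byte. */
static bool toy_ranges_overlap(uint64_t a, uint64_t a_size,
                               uint64_t b, uint64_t b_size)
{
    return a < b + b_size && b < a + a_size;
}

/* Toy first-fit allocator (an assumption, not the real policy): hand out
 * 'size' bytes from the bottom of the low MMIO window, stepping past the
 * reserved range if there is one. */
static uint64_t toy_alloc_low_mmio(uint64_t win_start,
                                   struct mmio_range reserved, uint64_t size)
{
    uint64_t candidate = win_start;

    assert(size > 0);
    if (reserved.size != 0 &&
        toy_ranges_overlap(candidate, size, reserved.start, reserved.size))
        candidate = reserved.start + reserved.size;
    return candidate;
}
```

With a hypothetical framebuffer at 0xf8000000 at the bottom of the low
MMIO window, the reserved case places a later allocation just past the
8MB framebuffer range, while the lfb_base == 0 case hands out the
framebuffer's own base, which is the conflict described above.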

> 
> When kdump-tools loads the kdump kernel into memory, the tool can
> accept any of the 3 parameters (e.g. I got the below via "man kexec" in
> Ubuntu 24.04):
> 
>        -s (--kexec-file-syscall)
>               Specify that the new KEXEC_FILE_LOAD syscall should be used
>               exclusively.
>
>        -c (--kexec-syscall)
>               Specify that the old KEXEC_LOAD syscall should be used
>               exclusively.
>
>        -a (--kexec-syscall-auto)
>               Try the new KEXEC_FILE_LOAD syscall first and when it is not
>               supported or the kernel does not understand the supplied
>               image fall back to the old KEXEC_LOAD interface.
>
>               There is no one single interface that always works, so this is
>               the default.
>
>               KEXEC_FILE_LOAD is required on systems that use locked-down
>               secure boot to verify the kernel signature.  KEXEC_LOAD may be
>               also disabled in the kernel configuration.
>
>               KEXEC_LOAD is required for some kernel image formats and on
>               architectures that do not implement KEXEC_FILE_LOAD.
> 
> If none of the parameters are specified, the default may be -c, or -s
> or -a, depending on the distro and the version in use.  We can run
>     strace -f kdump-config reload  2>&1 | egrep 'kexec_file_load|kexec_load' 
> to tell which syscall is being used.
> 
> Old distro versions seem to use KEXEC_LOAD by default, and new distro
> versions tend to use KEXEC_FILE_LOAD by default, especially when
> Secure Boot is enabled (e.g. see /usr/sbin/kdump-config: kdump_load()
> in Ubuntu).

Agreed. I think I had seen that previously.

> 
> In Ubuntu, we can explicitly specify one of the parameters in
> "/etc/default/kdump-tools", e.g. KDUMP_KEXEC_ARGS="-c -d".
> 
> The -d is for debugging. I found it very useful: when we run
> "kdump-config show" or "kdump-config reload", we get very useful
> debug info with -d.
> 
> On x86-64, with -c:
> The kdump-tools gets the framebuffer's MMIO base using
> ioctl(fd, FBIOGET_FSCREENINFO, ....): see the end of the email for
> an example program; kdump-tools then uses the KEXEC_LOAD syscall
> to set up the screen_info.lfb_base for the kdump kernel.

Thanks. While redoing some experiments yesterday, I found a
similar program that I had written a year ago to dump the ioctl results.
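
For reference, a minimal sketch of such a query is below. It is not the
program from either of us and not the kexec-tools code; the helper name
read_fb_fix() is hypothetical. It only relies on the standard fbdev
FBIOGET_FSCREENINFO ioctl and struct fb_fix_screeninfo from <linux/fb.h>.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/fb.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Fill *fix from the framebuffer device at 'path'.
 * Returns 0 on success, -1 if the device cannot be opened or the ioctl
 * fails. On failure *fix is left zeroed. */
static int read_fb_fix(const char *path, struct fb_fix_screeninfo *fix)
{
    int fd, ret;

    assert(path != NULL && fix != NULL);
    memset(fix, 0, sizeof(*fix));

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    ret = ioctl(fd, FBIOGET_FSCREENINFO, fix);
    close(fd);
    if (ret < 0) {
        memset(fix, 0, sizeof(*fix));
        return -1;
    }
    return 0;
}
```

Calling read_fb_fix("/dev/fb0", &fix) and printing fix.id and
fix.smem_start shows the driver name and MMIO base that a user space
loader would see. A driver may report smem_start as 0, so a zero value
does not by itself mean the call failed.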

> 
> The function in kdump-tools that gets the framebuffer MMIO base
> is kexec/arch/i386/x86-linux-setup.c: setup_linux_vesafb():
> https://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git/tree/kexec/arch/i386/x86-linux-setup.c?h=v2.0.32#n133
> 
> Unfortunately, setup_linux_vesafb() only recognizes the vesafb
> driver in the Linux kernel ("VESA VGA") and the efifb driver ("EFI VGA").
> It looks like arch_options.reuse_video_type is normally 0.
> 
> This means the kdump kernel's screen_info.lfb_base is 0 if
> hyperv_fb or hyperv_drm loads. In the past, for an Ubuntu kernel
> with CONFIG_FB_EFI=y, our workaround was to blacklist
> hyperv_fb or hyperv_drm, so that /dev/fb0 is backed by efifb and
> the screen_info.lfb_base is correctly set for kdump.

Hmmm. This is worse than I thought for x86/x64. In fact, it means
a part of my commit message for 304386373007 is now wrong. I had
described everything as working when using the kexec_load() system
call because the FBIOGET_FSCREENINFO ioctl was returning a good
value for smem_start (at least with the hyperv_fb driver). But as you
point out further down, newer versions of the kexec user space program
are ignoring that smem_start value unless the driver is vesafb or efifb.

Was blacklisting hyperv_fb or hyperv_drm in the kdump kernel
a workaround we had promulgated in the past? My recollection
is vague. But no matter.
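
To make the effect concrete, here is a simplified model of the behavior
you describe. It is a sketch, not the kexec-tools source: the toy_*
function names are hypothetical, the reuse_video_type path is left out,
and the "hyperv_fb" id string in the usage note is an assumption about
what that driver reports. The two recognized ids are the ones quoted
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True only for the framebuffer driver ids that, per this thread,
 * setup_linux_vesafb() recognizes. */
static bool toy_fb_id_recognized(const char *id)
{
    assert(id != NULL);
    return strcmp(id, "VESA VGA") == 0 || strcmp(id, "EFI VGA") == 0;
}

/* Model of the value that ends up in the kdump kernel's
 * screen_info.lfb_base with the kexec_load() path: the smem_start from
 * the ioctl is propagated only for a recognized driver id. */
static unsigned long toy_lfb_base_for_kdump(const char *id,
                                            unsigned long smem_start)
{
    return toy_fb_id_recognized(id) ? smem_start : 0;
}
```

In this model, toy_lfb_base_for_kdump("EFI VGA", base) returns base,
while toy_lfb_base_for_kdump("hyperv_fb", base) returns 0 even though
the ioctl supplied a good smem_start.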

> 
> However, now CONFIG_FB_EFI is not set in recent Ubuntu kernels:
> $ egrep 'CONFIG_FB_EFI|CONFIG_SYSFB|CONFIG_SYSFB_SIMPLEFB|CONFIG_DRM_SIMPLEDRM|CONFIG_DRM_HYPERV' \
>       /boot/config-6.8.0-1051-azure
> CONFIG_SYSFB=y
> CONFIG_SYSFB_SIMPLEFB=y
> CONFIG_DRM_SIMPLEDRM=y
> CONFIG_DRM_HYPERV=m
> # CONFIG_FB_EFI is not set
> 
> So, with Ubuntu 22.04/24.04,  -c can't avoid the MMIO conflict
> for Gen2 x86-64 VMs now, even if we blacklist hyperv_fb/hyperv_drm.
> Note: Ubuntu 20.04 uses an old version of the kdump-tools, so
> the statement is different there (see the later discussion below).
> 
> hyperv_fb has been removed in the mainline kernel: see
> commit 40227f2efcfb ("fbdev: hyperv_fb: Remove hyperv_fb driver")
> so we no longer need to worry about it.
> 
> Even if we modify setup_linux_vesafb() to support  hyperv_drm,
> it still won't work, because the MMIO base is hidden by commit
> da6c7707caf3 ("fbdev: Add FBINFO_HIDE_SMEM_START flag")

Agreed.

> 
> On x86-64, with -s:
> The KEXEC_FILE_LOAD syscall sets the kdump kernel's
> screen_info.lfb_base in the kernel: see
> 
> "arch/x86/kernel/kexec-bzimage64.c"
>     bzImage64_load
>         setup_boot_parameters
>             memcpy(&params->screen_info, &screen_info,
>                    sizeof(struct screen_info));
> 
> so, as long as the first kernel's hyperv_drm doesn't relocate the
> MMIO base, kdump should work fine; if the MMIO base is relocated,
> currently hyperv_drm doesn't update the screen_info.lfb_base,
> so the kdump's efifb driver and hv_pci driver won't work. Normally
> hyperv_drm doesn't relocate the MMIO base, unless the user
> specifies a very high resolution and the required MMIO size
> exceeds the default 8MB reserved by vmbus_reserve_fb() -- let's
> ignore that scenario for now.
> 

Agreed.

> 
> On ARM64, with -c:
> The kdump-tools doesn't even open /dev/fb0 (we can confirm this by using
> strace or bpftrace), so the kdump kernel's screen_info.lfb_base is always 0.

Agreed.

> 
> On ARM64, with -s:
> "arch/arm64/kernel/kexec_image.c": image_load() doesn't set the
> params->screen_info, so the kdump kernel's screen_info.lfb_base is always 0.

Agreed.

> 
> To recap, with a recent mainline kernel (or the linux-azure kernels) that
> has 304386373007, my observation on Ubuntu 22.04 and 24.04 is:
>     on x86-64, -c fails, but -s works.
>     on ARM64, -c fails, and -s also fails.
> 
> Note: the kdump-tools v2.0.18 in Ubuntu 20.04 doesn't have this commit:
> https://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git/commit/?id=fb5a8792e6e4ee7de7ae3e06d193ea5beaaececc
> (Note the "return 0;" in setup_linux_vesafb())
> so, on x86-64, -c also works in Ubuntu 20.04, if hyperv_fb is used
> (-c still doesn't work if hyperv_drm is used due to da6c7707caf3).

Ah. That explains why I thought x86/x64 kdump was working with
hyperv_fb when working on commit 304386373007. I was testing with
kexec user space utility v2.0.18, which *does* propagate smem_start
from the ioctl to the loaded kdump image.

> 
> With this patch
> "PCI: hv: Allocate MMIO from above 4GB for the config window",
> both -c  and -s work on x86-64 and ARM64 due to no MMIO conflict,
> as long as there are no 32-bit PCI BARs (which should be true on
> Azure and on modern hosts.)
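
A minimal sketch of the idea, under stated assumptions: this is a toy
first-fit allocator over a list of MMIO windows with a minimum-address
floor, not the patch itself and not the VMBus allocator. The struct,
toy_alloc_mmio() and the window addresses in the usage note are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_4G (1ULL << 32)

struct toy_window {
    uint64_t start;
    uint64_t end; /* exclusive */
};

/* Return the lowest address at or above 'min' where 'size' bytes fit in
 * one of the windows, scanning the windows in order. Returns 0 if
 * nothing fits. */
static uint64_t toy_alloc_mmio(const struct toy_window *w, size_t n,
                               uint64_t min, uint64_t size)
{
    assert(w != NULL && size > 0);

    for (size_t i = 0; i < n; i++) {
        uint64_t base = w[i].start > min ? w[i].start : min;

        if (base < w[i].end && w[i].end - base >= size)
            return base;
    }
    return 0;
}
```

With a hypothetical low window at [0xf8000000, 4GB) listed first and a
high window starting at 64GB, a floor of 0 returns an address in the low
window, where the frame buffer also lives. A floor of TOY_4G skips the
low window entirely, so the config window cannot overlap a frame buffer
that sits below 4GB, whether or not that frame buffer was reserved.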
> 
> With the patch, even if hyperv_drm relocates the framebuffer MMIO
> base, there would still be no MMIO conflict because typically hyperv_drm
> gets its MMIO from below 4GB: it seems like vmbus_walk_resources()
> always finds the low MMIO range first and adds it to the beginning of the
> MMIO resources "hyperv_mmio", so presumably hyperv_drm would
> get MMIO from the low MMIO range.
> 
> I'll update the commit message, add Matthew's and Krister's
> Tested-by's and post v2.

See my comments on v2 of your patch.  I have a thought for a
slightly different approach to solve the problem.

Michael

> 
> Thanks,
> Dexuan
