From: Michael Kelley <[email protected]> Sent: Thursday, January 22, 2026 10:39 PM
>
> From: Matthew Ruffell <[email protected]> Sent: Thursday, January 22, 2026 9:39 PM
> >
> > Hi Michael,
> >
> > > > I wonder if commit a41e0ab394e4 broke the initialization of
> > > > screen_info in the kdump kernel. Or perhaps there is now a
> > > > rev-lock between the kernel with this commit and a new version
> > > > of the user space kexec command.
> >
> > a41e0ab394e4 isn't a mainline commit. Can you please mention the
> > commit subject so I can have a read.
>
> It's this patch:
>
> https://lore.kernel.org/lkml/[email protected]/
>
> which is in linux-next, but not yet in mainline. Since you are dealing
> with older kernels, it's not the culprit.
>
> >
> > > > There's a parameter to the kexec() command that governs whether it
> > > > uses the kexec_file_load() system call or the kexec_load() system
> > > > call. I wonder if that parameter makes a difference in the problem
> > > > described for this patch.
> >
> > Yes, it does indeed make a difference. I have been debugging this the
> > past few days, and my colleague Melissa noticed that the problem
> > reproduces when secure boot is disabled, but it does not reproduce
> > when secure boot is enabled. Additionally, it reproduces on jammy, but
> > not noble. It turns out that kexec-tools on jammy defaults to
> > kexec_load() when secure boot is disabled, and when enabled, it
> > instead uses kexec_file_load(). On noble, it defaults to first trying
> > kexec_file_load() before falling back to kexec_load(), so the issue
> > does not reproduce.
>
> This is good info, and definitely a clue. So to be clear, the problem
> repros only when kexec_load() is used. With kexec_file_load(), it does
> not repro. Is that right? I saw a similar distinction when working on
> commit 304386373007, though in the opposite direction!
>
> >
> > > > >      /*
> > > > >       * Set up a region of MMIO space to use for accessing configuration
> > > > > -     * space.
> > > > > +     * space.  Use the high MMIO range to not conflict with the hyperv_drm
> > > > > +     * driver (which normally gets MMIO from the low MMIO range) in the
> > > > > +     * kdump kernel of a Gen2 VM, which fails to reserve the framebuffer
> > > > > +     * MMIO range in vmbus_reserve_fb() due to screen_info.lfb_base being
> > > > > +     * zero in the kdump kernel.
> > > > >       */
> > > > > -    ret = vmbus_allocate_mmio(&hbus->mem_config, hbus->hdev, 0, -1,
> > > > > +    ret = vmbus_allocate_mmio(&hbus->mem_config, hbus->hdev, SZ_4G, -1,
> > > > >                                PCI_CONFIG_MMIO_LENGTH, 0x1000, false);
> > > > >      if (ret)
> > > > >              return ret;
> > > > > --
> >
> > Thank you for the patch Dexuan.
> >
> > This patch fixes the problem on Ubuntu 5.15, and 6.8 based kernels
> > booting V6 instance types on Azure with Gen 2 images.
>
> Are you seeing the problem on x86/64 or arm64 instances in Azure?
> "V6 instance types" could be either, I think, but I'm guessing you
> are on x86/64.
>
> And just to confirm: are you seeing the problem with the
> Hyper-V DRM driver, or the Hyper-V FB driver? This patch mentions
> the DRM driver, so I assume that's the problematic config.
>
> >
> > Tested-by: Matthew Ruffell <[email protected]>
>
> While this patch may solve the observed problem, I'm interested in
> understanding the root cause of why vmbus_reserve_fb() is seeing
> screen_info.lfb_base set to zero. It may be next week before I can
> take a look, and I may need follow up with you on more details of the
> scenario to reproduce the problem.

One more thought here: Is commit 96959283a58d relevant? The commit
message describes a scenario where vmbus_reserve_fb() doesn't do
anything because CONFIG_SYSFB is not set. Looking at the code for
vmbus_reserve_fb(), it doing nothing might imply that
screen_info.lfb_base is 0. But when CONFIG_SYSFB is not set,
screen_info.lfb_base is just ignored, with the same result. This
behavior started with the 6.7 kernel due to commit a07b50d80ab6.

Note that commit 96959283a58d has a follow-on to correct a problem
when CONFIG_EFI is not set. See commit 7b89a44b2e8c. If there's a
reason to backport 96959283a58d, also get 7b89a44b2e8c.

Michael
