https://bugzilla.kernel.org/show_bug.cgi?id=220749
--- Comment #26 from [email protected] ([email protected]) ---

Hi Mario,

Quick follow-up on the two points you raised: why I tested mem_encrypt=off, and what happens with amd_iommu=off in the different combinations.

1. Why I originally used mem_encrypt=off

The only reason I experimented with mem_encrypt=off was that some of the early boot failures looked similar to what I’ve seen documented when SME/TSME is involved on unsupported or misconfigured systems. Since this HP firmware exposes no explicit SME/TSME/“Memory Guard” toggle anywhere in the setup utility, I tried mem_encrypt=off as a way to rule out any latent memory-encryption interaction contributing to the hangs.

As you pointed out, on this platform it doesn’t change behavior in a meaningful way. It was purely a diagnostic attempt to eliminate one possible class of problems.

When I originally sent you the update that included that flag, it was at the end of a long day of repeated initramfs drops, and I was just relieved to reach userspace with something other than acpi=off and nomodeset, so I carried it forward out of caution. After re-testing without it and comparing logs with a clearer head, I realized it wasn’t contributing anything, so I’ve removed it from further testing.

2. Results with amd_iommu=off

Per your suggestion I tried the two combinations:

amd_iommu=off (by itself)

This reproduces the (unfortunately) familiar freeze, occurring a few seconds after “loading initial ramdisk” on the splash screen. No initramfs prompt or rescue shell appears; the keyboard LEDs go out and I have to hard power-off.

pci=noacpi amd_iommu=off

This one boots successfully into Debian and reaches userspace. The root filesystem is found and the system seems stable under normal use.

So at this point I have two “working” configurations: this one, and the earlier pci=noacpi iommu=pt setup.
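To keep the logs from each boot labelled with the configuration that produced them, something like the following could be used. This is only a minimal sketch in Python: the helper name `iommu_params` and the list of parameter names are my own illustration (taken from the parameters discussed in this thread), not part of any kernel tooling.

```python
# Hypothetical helper for labelling boot logs; not part of any kernel tooling.
# The parameter names are the ones discussed in this thread.
PARAMS_OF_INTEREST = ("amd_iommu", "iommu", "pci", "mem_encrypt")


def iommu_params(cmdline):
    """Return the parameters of interest found in a kernel command line.

    `cmdline` is the text of the kernel command line, e.g. the contents
    of /proc/cmdline on the booted system.
    """
    found = {}
    for token in cmdline.split():
        # Split "key=value" on the first "="; a bare flag gets an empty value.
        key, _, value = token.partition("=")
        if key in PARAMS_OF_INTEREST:
            found[key] = value
    return found
```

On a booted system it could be called as `iommu_params(open("/proc/cmdline").read())` and the result saved next to the dmesg/journalctl output for that boot.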
Next step on my side is to get a serial console set up so I can capture early boot logs for the failing amd_iommu=off case (and the non-pt IOMMU path more generally) and send those along.

Thanks again for sticking with this. I appreciate the help.

On Thursday, November 20th, 2025 at 7:44 AM, [email protected] <[email protected]> wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=220749
>
> --- Comment #25 from Mario Limonciello (AMD) ([email protected]) ---
>
> > If serial console output would still be useful after this round of testing, I
> > can pick up a PCIe serial card and try to capture logs that way. For now, the
> > information below is from dmesg/journalctl on successful boots.
>
> At this point we're still looking for a needle in a haystack and guessing. No
> promises of course; but a serial console might get us a better hint at where
> things are going awry.
>
> > the system also fails to reach userspace, but in this case it drops to the
> > initramfs prompt with an error that it cannot find the root filesystem and
> > offers a rescue shell. This behavior is consistent whenever `iommu=pt` is
> > removed, and has occurred on each Debian kernel I’ve tested on this machine
> > (e.g. 6.12.x, 6.16.8, and previously 6.18.x before I removed it).
>
> OK this is really helpful to know actually. It points at an IOMMU issue that
> it would be helpful to look at the logs for. You should be able to get a
> working network stack at the initramfs prompt and use something like netconsole
> or ssh (where you'll have to include these in your initramfs) to be able to
> send the full kernel log in this configuration to another machine.
>
> Instead of iommu=pt can you please try amd_iommu=off both with and without
> pci=noacpi?
>
> > I’ve double-checked the firmware setup utility on this HP OmniDesk M02-0310.
> > There is no TSME / SME / “Memory Guard” / “Transparent Secure Memory
> > Encryption” option exposed anywhere in the BIOS (I checked under Security,
> > Advanced, and CPU-related menus). So there is nothing I can toggle for TSME
> > on this system.
>
> As your previous testing with memory encryption was pointless (where did this
> come from?) then TSME will also be pointless.
>
> > Under the working combination (`pci=noacpi iommu=pt`), GPU acceleration
> > appears to be working correctly:
>
> OK, this is a much more usable system at least then.
>
> > [ 0.033598] AMD-Vi: ivrs, add hid:AMDI0020, uid:ID00, rdevid:0xa0
> > [ 0.033600] AMD-Vi: ivrs, add hid:AMDI0020, uid:ID01, rdevid:0xa0
> > [ 0.033603] AMD-Vi: ivrs, add hid:AMDI0020, uid:ID02, rdevid:0xa0
> > [ 0.033604] AMD-Vi: ivrs, add hid:AMDI0020, uid:ID03, rdevid:0xa0
> > [ 0.033605] AMD-Vi: Using global IVHD EFR:0x246577efa2254afa, EFR2:0x10
> > [ 0.033646] AMD-Vi: [Firmware Bug]: : No southbridge IOAPIC found
> > [ 0.033650] AMD-Vi: Disabling interrupt remapping
> > [ 0.033655] clocksource: tsc-early: mask: 0xfffffff
>
> From your logs I noticed this while IOMMU is in pass through. You can see
> there is definitely a firmware bug in the IVRS table entries that the kernel
> tells you about.
>
> https://github.com/torvalds/linux/blob/23cb64fb76257309e396ea4cec8396d4a1dbae68/drivers/iommu/amd/init.c#L3103
>
> If you look at commit history you can see this message was added in place to
> turn off interrupt remapping and try to work around the BIOS bug.
>
> https://github.com/torvalds/linux/commit/c2ff5cf5294bcbd7fa50f7d860e90a66db7e5059
>
> This is probably enough to /start/ your conversation with HP support about
> fixing the firmware as it's tangible and definitely not a Linux problem.
>
> --
> You may reply to this email to add a comment.
>
> You are receiving this mail because:
> You are on the CC list for the bug.
> You reported the bug.

--
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.
_______________________________________________
acpi-bugzilla mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/acpi-bugzilla
