On 1/19/26 20:01, Andrew Cooper wrote:
> On 19/01/2026 10:34 am, Julian Vetter wrote:
>> On 1/15/26 4:50 PM, Andrew Cooper wrote:
>>> On 15/01/2026 3:17 pm, Julian Vetter wrote:
>>>> +{
>>>> + uint64_t misc_enable;
>>>> + uint32_t eax, ebx, ecx, edx;
>>>> +
>>>> + if ( !boot_cpu_has(X86_FEATURE_NX) )
>>>> + {
>>>> + /* Intel: try to unhide NX by clearing XD_DISABLE */
>>>> + cpuid(0, &eax, &ebx, &ecx, &edx);
>>>> + if ( ebx == X86_VENDOR_INTEL_EBX &&
>>>> + ecx == X86_VENDOR_INTEL_ECX &&
>>>> + edx == X86_VENDOR_INTEL_EDX )
>>>> + {
>>>> + rdmsrl(MSR_IA32_MISC_ENABLE, misc_enable);
>>>> + if ( misc_enable & MSR_IA32_MISC_ENABLE_XD_DISABLE )
>>>> + {
>>>> + misc_enable &= ~MSR_IA32_MISC_ENABLE_XD_DISABLE;
>>>> + wrmsrl(MSR_IA32_MISC_ENABLE, misc_enable);
>>>> +
>>>> + /* Re-read CPUID after having cleared XD_DISABLE */
>>>> + boot_cpu_data.x86_capability[FEATURESET_e1d] =
>>>> cpuid_edx(0x80000001U);
>>>> +
>>>> + /* Adjust misc_enable_off for secondary startup and
>>>> wakeup code */
>>>> + bootsym(trampoline_misc_enable_off) |=
>>>> MSR_IA32_MISC_ENABLE_XD_DISABLE;
>>>> + printk(KERN_INFO "re-enabled NX (Execute Disable)
>>>> protection\n");
>>>> + }
>>>> + }
>>>> + /* AMD: nothing we can do - NX must be enabled in BIOS */
>>> The BIOS is only hiding the CPUID bit. It's not blocking the use of NX.
>> Yes, you're right.
>>> You want to do a wrmsr_safe() trying to set EFER.NXE, and if it
>>> succeeds, set the NX bit in MSR_K8_EXT_FEATURE_MASK to "unhide" it in
>>> regular CPUID. This is a little more tricky to arrange because it needs
>>> doing on each CPU, not just the BSP.
>> Ok, yes, I have modified the AMD side to use MSR_K8_EXT_FEATURE_MASK to
>> "unhide" it.
>
> Great. And contrary to the other thread, this really must modify the
> mask MSRs rather than use setup_force_cpu_cap(), because we still need
> it to be visible to PV guest kernels which can't see Xen's choice of
> setup_force_cpu_cap().
>
>>
>>>> + }
>>>> +
>>>> + /* Enable EFER.NXE only if NX is available */
>>>> + if ( boot_cpu_has(X86_FEATURE_NX) )
>>>> + {
>>>> + if ( !(read_efer() & EFER_NXE) )
>>>> + write_efer(read_efer() | EFER_NXE);
>>>> +
>>>> + /* Adjust trampoline_efer for secondary startup and wakeup code */
>>>> + bootsym(trampoline_efer) |= EFER_NXE;
>>>> + }
>>>> +
>>>> + if ( IS_ENABLED(CONFIG_REQUIRE_NX) && !boot_cpu_has(X86_FEATURE_NX) )
>>>> + panic("This build of Xen requires NX support\n");
>>>> +}
>>>> +
>>>> /* How much of the directmap is prebuilt at compile time. */
>>>> #define PREBUILT_MAP_LIMIT (1 << L2_PAGETABLE_SHIFT)
>>>>
>>>> @@ -1159,6 +1203,8 @@ void asmlinkage __init noreturn __start_xen(void)
>>>> rdmsrl(MSR_EFER, this_cpu(efer));
>>>> asm volatile ( "mov %%cr4,%0" : "=r" (info->cr4) );
>>>>
>>>> + nx_init();
>>>> +
>>>> /* Enable NMIs. Our loader (e.g. Tboot) may have left them
>>>> disabled. */
>>>> enable_nmis();
>>>>
>>> This is too early, as can be seen by the need to make a cpuid() call
>>> rather than using boot_cpu_data.
>>>
>>> The cleanup I wanted to do was to create/rework early_cpu_init() to get
>>> things in a better order, so the panic() could go at the end here. The
>>> current split we've got of early/regular CPU init was inherited from
>>> Linux and can be collapsed substantially.
>> I have tried to add the logic into the early_init_{intel,amd}()
>> functions. But it seems this is already too late in the boot chain. This
>> is why I put into an extra function which is called earlier. Because it
>> seems there are already pages with PAGE_NX being used on the way to
>> early_init_{intel,amd}(). Because when I put my code into
>> early_init_intel I get a fault and a reboot. What do you suggest?
>
> Have you got the backtrace available?

Yes, here it is. Note that with 'CONFIG_MICROCODE_LOADING' enabled it
faults even earlier, somewhere in 'find_cpio_data()', but with the same
error code, EC = 0x0009 (protection violation, reserved bit violation).

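For reference, this is how I read the error code. It is only a minimal
sketch: the bit positions follow the architectural x86 #PF error code
layout, but the PF_EC_* macros and describe_pf_ec() are hypothetical
names made up for illustration, not Xen's own definitions.

```c
#include <stdint.h>
#include <stdio.h>

/* Architectural x86 #PF error code bits. Macro names are illustrative only. */
#define PF_EC_PRESENT (1u << 0) /* 1: protection violation, 0: page not present */
#define PF_EC_WRITE   (1u << 1) /* 1: write access, 0: read access */
#define PF_EC_USER    (1u << 2) /* 1: user mode, 0: supervisor mode */
#define PF_EC_RSVD    (1u << 3) /* 1: reserved bit set in a paging entry */
#define PF_EC_INSN    (1u << 4) /* 1: instruction fetch */

/*
 * Hypothetical helper: format a human-readable description of a #PF
 * error code into buf. Returns the snprintf() result.
 */
int describe_pf_ec(uint32_t ec, char *buf, size_t len)
{
    return snprintf(buf, len, "%s, %s, %s%s%s",
                    (ec & PF_EC_PRESENT) ? "protection violation"
                                         : "page not present",
                    (ec & PF_EC_WRITE) ? "write" : "read",
                    (ec & PF_EC_USER) ? "user" : "supervisor",
                    (ec & PF_EC_RSVD) ? ", reserved bit set" : "",
                    (ec & PF_EC_INSN) ? ", instruction fetch" : "");
}
```

Calling describe_pf_ec(0x0009, buf, sizeof(buf)) reports a supervisor
read that hit a protection violation with a reserved bit set. That would
be consistent with an entry having bit 63 set while EFER.NXE is still
clear, since bit 63 is reserved in that case, but I have not confirmed
which entry is responsible.
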
Xen 4.22-unstable
(XEN) Xen version 4.22-unstable (julian@work) (gcc (Debian 15.2.0-12) 15.2.0) debug=y Thu Jan 22 14:28:58 CET 2026
(XEN) Latest ChangeSet: Tue Jan 13 16:50:12 2026 +0100 git:ce886ef641
(XEN) build-id: 2e72a4b08fca3ae0f0ed9af0dd3a5de947a966d0
(XEN) CPU Vendor: Intel, Family 6 (0x6), Model 55 (0x37), Stepping 8 (raw 00030678)
(XEN) BSP microcode revision: 0x00000836
(XEN) Bootloader: GRUB 2.12
(XEN) Command line: dom0_mem=1232M,max:1232M watchdog ucode=scan dom0_max_vcpus=1-1 com1=115200,8n1 console=com1
(XEN) Xen image load base address: 0xb5800000
(XEN) Video information:
(XEN) VGA is graphics mode 800x600, 32 bpp
(XEN) Disc information:
(XEN) Found 0 MBR signatures
(XEN) Found 1 EDD information structures
(XEN) EFI RAM map:
(XEN) [0000000000000000, 000000000003efff] (usable)
(XEN) [000000000003f000, 000000000003ffff] (ACPI NVS)
(XEN) [0000000000040000, 000000000009ffff] (usable)
(XEN) [0000000000100000, 000000001effffff] (usable)
(XEN) [000000001f000000, 000000001f0fffff] (reserved)
(XEN) [000000001f100000, 000000001fffffff] (usable)
(XEN) [0000000020000000, 00000000200fffff] (reserved)
(XEN) [0000000020100000, 00000000b9377fff] (usable)
(XEN) [00000000b9378000, 00000000b93a7fff] (reserved)
(XEN) [00000000b93a8000, 00000000b94bdfff] (usable)
(XEN) [00000000b94be000, 00000000b98d6fff] (ACPI NVS)
(XEN) [00000000b98d7000, 00000000b9bb0fff] (reserved)
(XEN) [00000000b9bb1000, 00000000b9bb1fff] (usable)
(XEN) [00000000b9bb2000, 00000000b9bf3fff] (reserved)
(XEN) [00000000b9bf4000, 00000000b9d6dfff] (usable)
(XEN) [00000000b9d6e000, 00000000b9ff9fff] (reserved)
(XEN) [00000000b9ffa000, 00000000b9ffffff] (usable)
(XEN) [00000000e00f8000, 00000000e00f8fff] (reserved)
(XEN) [00000000fed01000, 00000000fed01fff] (reserved)
(XEN) [00000000fed08000, 00000000fed08fff] (reserved)
(XEN) [00000000ffb00000, 00000000ffffffff] (reserved)
(XEN) [0000000100000000, 000000013fffffff] (usable)
(XEN) Early fatal page fault at e008:ffff82d0403b38e0 (cr2=0000000001100202, ec=0009)
(XEN) ----[ Xen-4.22-unstable x86_64 debug=y Not tainted ]----
(XEN) CPU: 0
(XEN) RIP: e008:[<ffff82d0403b38e0>] memcmp+0x20/0x46
(XEN) RFLAGS: 0000000000010002 CONTEXT: hypervisor
(XEN) rax: 0000000000000000 rbx: 0000000001100000 rcx: 0000000000000000
(XEN) rdx: 0000000000000004 rsi: ffff82d0404a0d23 rdi: 0000000001100202
(XEN) rbp: ffff82d040497d88 rsp: ffff82d040497d78 r8: 0000000000000016
(XEN) r9: ffff82d04061a180 r10: ffff82d04061a188 r11: 0000000000000010
(XEN) r12: 0000000001100000 r13: 0000000000000001 r14: ffff82d0404d2b80
(XEN) r15: ffff82d040462750 cr0: 0000000080050033 cr4: 00000000000000a0
(XEN) cr3: 00000000b5d0e000 cr2: 0000000001100202
(XEN) fsb: 0000000000000000 gsb: 0000000000000000 gss: 0000000000000000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
(XEN) Xen code around <ffff82d0403b38e0> (memcmp+0x20/0x46):
(XEN) 0f 1f 84 00 00 00 00 00 <0f> b6 04 0f 44 0f b6 04 0e 44 29 c0 75 13 48 83
(XEN) Xen stack trace from rsp=ffff82d040497d78:
(XEN) ffff82d040483f79 0000000000696630 ffff82d040497db0 ffff82d040483fd2
(XEN) 0000000000696630 ffff82d040200000 0000000000000001 ffff82d040497ef8
(XEN) ffff82d04047c4ac 0000000000000000 0000000000000000 0000000000000000
(XEN) ffff82d04062c6d8 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000140000 0000000000000000 0000000000000001
(XEN) 0000000000000000 0000000000000000 ffff82d040497f08 ffff82d0404d2b80
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000800000000 000000010000006e 0000000000000003
(XEN) 00000000000002f8 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000099f30ba0 0000000099feeda7 0000000000000000 ffff82d040497fff
(XEN) 00000000b9cf3920 ffff82d0402043e8 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000e01000000000 0000000000000000 0000000000000000
(XEN) 00000000000000a0 0000000000000000 0000000000000000 0000000000000000
(XEN) Xen call trace:
(XEN) [<ffff82d0403b38e0>] R memcmp+0x20/0x46
(XEN) [<ffff82d040483f79>] S arch/x86/bzimage.c#bzimage_check+0x2e/0x73
(XEN) [<ffff82d040483fd2>] F bzimage_headroom+0x14/0xa5
(XEN) [<ffff82d04047c4ac>] F __start_xen+0x908/0x2452
(XEN) [<ffff82d0402043e8>] F __high_start+0xb8/0xc0
(XEN)
(XEN) Pagetable walk from 0000000001100202:
(XEN) L4[0x000] = 00000000b5c9d063 ffffffffffffffff
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) FATAL TRAP: vec 14, #PF[0009] IN INTERRUPT CONTEXT
(XEN) ****************************************
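
To relate the pagetable walk to the faulting address, cr2 can be split
into its paging indices. A minimal sketch, assuming 4-level paging with
4 KiB pages; the helper names are hypothetical and not taken from Xen:

```c
#include <stdint.h>

/* Each paging level indexes 512 entries, i.e. 9 bits of the address. */
#define PT_INDEX_MASK 0x1ffu

/* Hypothetical helpers: extract the per-level table index from a VA. */
static inline unsigned int l4_index(uint64_t va)
{
    return (unsigned int)((va >> 39) & PT_INDEX_MASK);
}

static inline unsigned int l3_index(uint64_t va)
{
    return (unsigned int)((va >> 30) & PT_INDEX_MASK);
}

static inline unsigned int l2_index(uint64_t va)
{
    return (unsigned int)((va >> 21) & PT_INDEX_MASK);
}

static inline unsigned int l1_index(uint64_t va)
{
    return (unsigned int)((va >> 12) & PT_INDEX_MASK);
}

/* Byte offset within the 4 KiB page. */
static inline unsigned int page_offset(uint64_t va)
{
    return (unsigned int)(va & 0xfffu);
}
```

For cr2 = 0x1100202 this gives L4 index 0 (matching the L4[0x000] line
in the walk), L3 index 0, L2 index 8, L1 index 0x100 and page offset
0x202.
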
>
> It's probably easiest if I prototype the split I'd like to see, and you
> integrate with that.
>
> ~Andrew
--
Julian Vetter | Vates Hypervisor & Kernel Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech