On 12/05/2024 3:48 pm, Andrew Cooper wrote:
> On 12/05/2024 3:16 am, Marek Marczykowski-Górecki wrote:
>> Hi,
>>
>> I've got a report[1] that after some update Linux HVM fails to start with the
>> error as in the subject. It looks to be caused by some change between
>> Xen 4.17.3 and 4.17.4. Here the failure is on Linux 6.6.25 (both dom0
>> and domU), but the 6.1.62 that worked with older Xen before, now fails
>> too. The full error (logged via earlyprintk=xen) is:
>>
>>     [    0.009500] Using GB pages for direct mapping
>>     PANIC: early exception 0x00 IP 10:ffffffffb01c32e2 error 0 cr2 0xffffa08649801000
>>     [    0.009606] CPU: 0 PID: 0 Comm: swapper Not tainted 6.6.25-1.qubes.fc37.x86_64 #1
>>     [    0.009665] Hardware name: Xen HVM domU, BIOS 4.17.4 04/26/2024
>>     [    0.009710] RIP: 0010:clear_page_orig+0x12/0x40
>>     [    0.009766] Code: 84 00 00 00 00 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 31 c0 b9 40 00 00 00 0f 1f 44 00 00 ff c9 <48> 89 07 48 89 47 08 48 89 47 10 48 89 47 18 48 89 47 20 48 89 47
>>     [    0.009862] RSP: 0000:ffffffffb0e03d58 EFLAGS: 00010016 ORIG_RAX: 0000000000000000
>>     [    0.009915] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000003f
>>     [    0.009967] RDX: 0000000000009801 RSI: 0000000000000000 RDI: ffffa08649801000
>>     [    0.010015] RBP: 0000000000000001 R08: 0000000000000001 R09: 6b7f283562d74b16
>>     [    0.010063] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
>>     [    0.010112] R13: 0000000000000000 R14: ffffffffb0e22a08 R15: ffffa08640000000
>>     [    0.010161] FS:  0000000000000000(0000) GS:ffffffffb16ea000(0000) knlGS:0000000000000000
>>     [    0.010214] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>     [    0.010257] CR2: ffffa08649801000 CR3: 0000000008e80000 CR4: 00000000000000b0
>>     [    0.010310] Call Trace:
>>     [    0.010341]  <TASK>
>>     [    0.010372]  ? early_fixup_exception+0xf7/0x190
>>     [    0.010416]  ? early_idt_handler_common+0x2f/0x3a
>>     [    0.010460]  ? clear_page_orig+0x12/0x40
>>     [    0.010501]  ? alloc_low_pages+0xeb/0x150
>>     [    0.010541]  ? __kernel_physical_mapping_init+0x1d2/0x630
>>     [    0.010588]  ? init_memory_mapping+0x83/0x160
>>     [    0.010631]  ? init_mem_mapping+0x9a/0x460
>>     [    0.010669]  ? memblock_reserve+0x6d/0xf0
>>     [    0.010709]  ? setup_arch+0x796/0xf90
>>     [    0.010748]  ? start_kernel+0x63/0x420
>>     [    0.010787]  ? x86_64_start_reservations+0x18/0x30
>>     [    0.010828]  ? x86_64_start_kernel+0x96/0xa0
>>     [    0.010868]  ? secondary_startup_64_no_verify+0x18f/0x19b
>>     [    0.010918]  </TASK>
>>
>> I'm pretty sure the exception 0 is misleading here, I don't see how it
>> could be #DE.
>>
>> More logs (including full hypervisor log) are attached to the linked
>> issue.
>>
>> This is on HP 240 g7, and my educated guess is it's Intel Celeron N4020
>> CPU. I cannot reproduce the issue on different hardware.
>>
>> PVH domains seem to work.
>>
>> Any ideas what could have happened here?
> Yes.
>
> Revert the microcode back to revision 0x38 for now.

The regression is fixed in revision 0x42.
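
To check which revision a machine is actually running, the following is a minimal sketch, not a definitive tool. It assumes the Linux /proc/cpuinfo format, where each CPU block carries a line such as "microcode : 0x38". The function names and the "suspect" label are hypothetical. Treating every revision strictly between 0x38 and 0x42 as suspect is an assumption; the thread only states that 0x38 works and 0x42 contains the fix.

```python
import re

KNOWN_GOOD = 0x38  # revision the thread says to revert to
FIXED_IN = 0x42    # revision the thread says contains the fix


def microcode_revisions(cpuinfo_text):
    """Return the set of microcode revisions found in /proc/cpuinfo-style text."""
    matches = re.findall(
        r"^microcode\s*:\s*(0x[0-9a-fA-F]+)", cpuinfo_text, re.MULTILINE
    )
    return {int(m, 16) for m in matches}


def classify(rev):
    """Classify a revision against the two revisions named in the thread."""
    if rev >= FIXED_IN:
        return "fixed"
    if rev == KNOWN_GOOD:
        return "known good"
    if KNOWN_GOOD < rev < FIXED_IN:
        # Assumption: the thread does not list the affected revisions.
        return "suspect"
    return "unknown"


# Illustrative input only; on a real system pass the contents of /proc/cpuinfo.
sample = "processor\t: 0\nmicrocode\t: 0x40\nprocessor\t: 1\nmicrocode\t: 0x40\n"
for rev in sorted(microcode_revisions(sample)):
    print(hex(rev), classify(rev))
```

On a real system the sample string would be replaced by the text read from /proc/cpuinfo. Note that the value reported inside a guest may not reflect what the hypervisor has loaded on the host.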

~Andrew
