Hello Jian,

On 01/15/18 09:54, Jian J Wang wrote:
> If PcdDxeNxMemoryProtectionPolicy is set to enable protection for memory
> of EfiBootServicesCode, EfiConventionalMemory, the BIOS will hang at a page
> fault exception triggered by PiSmmCpuDxeSmm.
>
> The root cause is that PiSmmCpuDxeSmm will access default SMM RAM starting
> at 0x30000 which is marked as non-executable, but NX feature was not
> enabled during SMM initialization. Accessing memory which has invalid
> attributes set will cause page fault exception. This patch fixes it by
> checking NX capability in cpuid and enable NXE in EFER MSR if it's
> available.
>
> Cc: Jiewen Yao <jiewen....@intel.com>
> Cc: Ruiyu Ni <ruiyu...@intel.com>
> Cc: Eric Dong <eric.d...@intel.com>
> Cc: Laszlo Ersek <ler...@redhat.com>
> Contributed-under: TianoCore Contribution Agreement 1.1
> Signed-off-by: Jian J Wang <jian.j.w...@intel.com>
> ---
>  UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmmInit.nasm | 14 ++++++++++++++
>  UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmmInit.nasm  | 12 +++++++++++-
>  2 files changed, 25 insertions(+), 1 deletion(-)

This patch also breaks OVMF on KVM. However, the circumstances and the
symptom differ from those of the other regression that I reported under
patch 1/6 in this series [1] [2].

Namely, this affects the IA32 build, with SMM:

$ build -a IA32 -p OvmfPkg/OvmfPkgIa32.dsc -D SECURE_BOOT_ENABLE \
    -D SMM_REQUIRE -t GCC48 -b NOOPT -D HTTP_BOOT_ENABLE

The value of PcdDxeNxMemoryProtectionPolicy is zero in this build too.

The virtual CPU model that this OVMF build runs on does *not* support
NX. (Please refer to "OvmfPkg/README":

> === SMM support ===
>
> [...]
>
> * QEMU binary and options specific to 32-bit guests:
>
>   $ qemu-system-i386 -cpu coreduo,-nx \

)

The boot hangs at the following location:

> Loading SMM driver at 0x0007FFA2000 EntryPoint=0x0007FFA844D 
> PiSmmCpuDxeSmm.efi
> SMRR Base: 0x7F000000, SMRR Size: 0x1000000
> PcdCpuSmmCodeAccessCheckEnable = 1
> mAddressEncMask = 0x0
> SMRAM TileSize = 0x00002000 (0x00001000, 0x00001000)
> SMRAM SaveState Buffer (0x7FF94000, 0x0000E000)
> CPU[000]  APIC ID=0000  SMBASE=7FF8C000  SaveState=7FF9BC00  Size=00000400
> CPU[001]  APIC ID=0001  SMBASE=7FF8E000  SaveState=7FF9DC00  Size=00000400
> CPU[002]  APIC ID=0002  SMBASE=7FF90000  SaveState=7FF9FC00  Size=00000400
> CPU[003]  APIC ID=0003  SMBASE=7FF92000  SaveState=7FFA1C00  Size=00000400
> <--HANG-->

KVM trace (excerpt):

>  1 CPU-4989  [002] 16163.874223: kvm_enter_smm:        vcpu 1: entering SMM, 
> smbase 0x30000
>  2 CPU-4989  [002] 16163.874244: kvm_fpu:              load
>  3 CPU-4989  [002] 16163.874245: kvm_entry:            vcpu 1
>  4 CPU-4989  [002] 16163.874247: kvm_exit:             reason EPT_VIOLATION 
> rip 0x8000 info 184 0
>  5 CPU-4989  [002] 16163.874247: kvm_page_fault:       address 38000 
> error_code 184
>  6 CPU-4989  [002] 16163.874251: kvm_entry:            vcpu 1
>  7 CPU-4989  [002] 16163.874253: kvm_exit:             reason EPT_VIOLATION 
> rip 0x7ff864ba info 184 0
>  8 CPU-4989  [002] 16163.874253: kvm_page_fault:       address 7ffb64ba 
> error_code 184
>  9 CPU-4989  [002] 16163.874256: kvm_entry:            vcpu 1
> 10 CPU-4989  [002] 16163.874257: kvm_exit:             reason CPUID rip 
> 0x7ff864c0 info 0 0
> 11 CPU-4989  [002] 16163.874258: kvm_cpuid:            func 80000001 rax 6e8 
> rbx 0 rcx 0 rdx 0
> 12 CPU-4989  [002] 16163.874258: kvm_entry:            vcpu 1
> 13 CPU-4989  [002] 16163.874259: kvm_exit:             reason CR_ACCESS rip 
> 0x7ff864cb info 3 0
> 14 CPU-4989  [002] 16163.874260: kvm_cr:               cr_write 3 = 0x0
> 15 CPU-4989  [002] 16163.874261: kvm_entry:            vcpu 1
> 16 CPU-4989  [002] 16163.874262: kvm_exit:             reason CR_ACCESS rip 
> 0x7ff864db info 4 0
> 17 CPU-4989  [002] 16163.874263: kvm_cr:               cr_write 4 = 0x640
> 18 CPU-4989  [002] 16163.874273: kvm_entry:            vcpu 1
> 19 CPU-4989  [002] 16163.874274: kvm_exit:             reason MSR_READ rip 
> 0x7ff864e4 info 0 0
> 20 CPU-4989  [002] 16163.874275: kvm_msr:              msr_read c0000080 = 0x0
> 21 CPU-4989  [002] 16163.874276: kvm_entry:            vcpu 1
> 22 CPU-4989  [002] 16163.874277: kvm_exit:             reason EPT_VIOLATION 
> rip 0x64f4 info 184 0
> 23 CPU-4989  [002] 16163.874277: kvm_page_fault:       address 364f4 
> error_code 184
> 24 CPU-4989  [002] 16163.874282: kvm_entry:            vcpu 1
> 25 CPU-4989  [002] 16163.874283: kvm_exit:             reason EPT_VIOLATION 
> rip 0x64f4 info 183 0
> 26 CPU-4989  [002] 16163.874283: kvm_page_fault:       address f000 
> error_code 183
> 27 CPU-4989  [002] 16163.874288: kvm_entry:            vcpu 1
> 28 CPU-4989  [002] 16163.874292: kvm_exit:             reason EPT_VIOLATION 
> rip 0x7000 info 184 0
> 29 CPU-4989  [002] 16163.874293: kvm_page_fault:       address 37000 
> error_code 184
> 30 CPU-4989  [002] 16163.874295: kvm_entry:            vcpu 1
> 31 CPU-4989  [002] 16163.874300: kvm_exit:             reason CPUID rip 
> 0x7ff864c0 info 0 0
> 32 CPU-4989  [002] 16163.874301: kvm_cpuid:            func 80000001 rax 6e8 
> rbx 0 rcx 0 rdx 0
> 33 CPU-4989  [002] 16163.874301: kvm_entry:            vcpu 1
> 34 CPU-4989  [002] 16163.874302: kvm_exit:             reason CR_ACCESS rip 
> 0x7ff864cb info 3 0
> 35 CPU-4989  [002] 16163.874303: kvm_cr:               cr_write 3 = 0x0
> 36 CPU-4989  [002] 16163.874304: kvm_entry:            vcpu 1

After this point, lines 19-36 repeat infinitely.

The above trace corresponds to the following, from
"UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmmInit.nasm":

> 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  43) global 
> ASM_PFX(SmmStartup)
> 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  44) ASM_PFX(SmmStartup):
> d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  45)     DB      0x66
> d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  46)     mov     eax, 
> 0x80000001             ; read capability
> d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  47)     cpuid
> d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  48)     DB      0x66
> d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  49)     mov     ebx, 
> edx                    ; rdmsr will change edx. keep it in ebx.
> 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  50)     DB      0x66, 
> 0xb8
> 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  51) ASM_PFX(gSmmCr3): 
> DD 0
> 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  52)     mov     cr3, eax
> 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  53)     DB      0x67, 
> 0x66
> 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  54)     lgdt    [cs:ebp 
> + (ASM_PFX(gcSmiInitGdtr) - ASM_PFX(SmmStartup))]
> 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  55)     DB      0x66, 
> 0xb8
> 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  56) ASM_PFX(gSmmCr4): 
> DD 0
> 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  57)     mov     cr4, eax
> d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  58)     DB      0x66
> d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  59)     mov     ecx, 
> 0xc0000080             ; IA32_EFER MSR
> d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  60)     rdmsr
> d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  61)     DB      0x66
> d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  62)     test    ebx, 
> BIT20                  ; check NXE capability
> d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  63)     jz      .1
> d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  64)     or      ah, 
> BIT3                    ; set NXE bit
> d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  65)     wrmsr
> d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  66) .1:
> 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  67)     DB      0x66, 
> 0xb8
> 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  68) ASM_PFX(gSmmCr0): 
> DD 0
> 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  69)     DB      0xbf, 
> PROTECT_MODE_DS, 0    ; mov di, PROTECT_MODE_DS
> 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  70)     mov     cr0, eax
> 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  71)     DB      0x66, 
> 0xea                   ; jmp far [ptr48]

The WRMSR is never reached (which is fine) but the CR0 write is also not
reached, ever. Instead, we seem to be stuck in SMM forever, looping back
to SmmStartup.

Here's the bisection log:

> git bisect start
> # good: [018432f0ce1b42541977f61f9c7607257a4bf43a] MdeModulePkg/Ip4Dxe: Add 
> an independent timer for reconfig checking
> git bisect good 018432f0ce1b42541977f61f9c7607257a4bf43a
> # bad: [06c1f423e17fe5ddef824d688d21c83730238ba6] BeagleBoardPkg: reroute 
> Firmware Vendor Pcd to MdeModulePkg
> git bisect bad 06c1f423e17fe5ddef824d688d21c83730238ba6
> # bad: [8ab0bd2397c9d3922e0c7dbb1aa6f7e08799079f] MdePkg/DMAR: Add the 
> definition for DMA_CTRL_PLATFORM_OPT_IN_FLAG bit
> git bisect bad 8ab0bd2397c9d3922e0c7dbb1aa6f7e08799079f
> # good: [24a105a7d8b4b8312743cf265f869dc049b7ff92] BaseTools: Disable warning 
> varargs in XCODE5 align to CLANG38
> git bisect good 24a105a7d8b4b8312743cf265f869dc049b7ff92
> # good: [b2725f57c7a1e6feeb176f1563a4f1a8c2eb6c6f] IntelSiliconPkg 
> IntelVTdPmrPei: Get high top by host address width
> git bisect good b2725f57c7a1e6feeb176f1563a4f1a8c2eb6c6f
> # good: [4f10654e04601fe67a750c9b5a4242efd4141569] UefiCpuPkg/CpuDxe: fix 
> SetMemoryAttributes issue in 32-bit mode
> git bisect good 4f10654e04601fe67a750c9b5a4242efd4141569
> # good: [fbe2c4b9be98a5c2b9c1f6976f51e2456467e752] UefiCpuPkg/CpuDxe: clear 
> NX attr for page directory
> git bisect good fbe2c4b9be98a5c2b9c1f6976f51e2456467e752
> # bad: [94c0129d244f91fa0a7b122414872da49a35f853] MdeModulePkg/PiSmmCore: 
> remove NX attr for SMM RAM
> git bisect bad 94c0129d244f91fa0a7b122414872da49a35f853
> # bad: [d4d87596c11d6e3f8220b6d9677797c802af3a33] UefiCpuPkg/PiSmmCpuDxeSmm: 
> Enable NXE if it's supported
> git bisect bad d4d87596c11d6e3f8220b6d9677797c802af3a33
> # first bad commit: [d4d87596c11d6e3f8220b6d9677797c802af3a33] 
> UefiCpuPkg/PiSmmCpuDxeSmm: Enable NXE if it's supported


Finally, I have an independent question: why are we still adding *new*
0x66 size prefixes with "DB"? NASM supports 16-bit, 32-bit and 64-bit
assembly code in the same source file, and such prefixes can be encoded
symbolically [3]:

> Explicit address-size and operand-size prefixes A16, A32, A64, O16 and
> O32, O64 are provided -- one example of their use is given in chapter
> 10.

Thanks
Laszlo

[1] https://lists.01.org/pipermail/edk2-devel/2018-January/020582.html
[2] https://lists.01.org/pipermail/edk2-devel/2018-January/020586.html
[3] http://www.nasm.us/doc/nasmdoc3.html
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel

Reply via email to