So, I don't understand how the CR3s that are used by the APs when they
serve MP services PPI requests, throughout the PEI phase (*), have
anything to do with CpuS3.c's page tables (which live in SMRAM, AIUI).

[Jiewen] It is very tricky.
First, in normal boot, the SMM need prepare a CR3 as SMM page table, which is 
obvious.

In S3, the S3Resume calls AsmWriteCr3(SmmS3ResumeState->SmmS3Cr3) then jump to 
SmmS3ResumeState->SmmS3ResumeEntryPoint. Now BSP hold SmmS3Cr3 but in non-SMM 
mode.

In SmmRestoreCpu(), BSP calls EarlyInitializeCpu()/PrepareApStartupVector() in 
non-SMM mode. And mExchangeInfo->Cr3         = (UINT32) (AsmReadCr3 ()); Now AP 
holds SmmS3Cr3 in non-SMM mode. It is OK, because SMRAM is OPEN.

When SmmRelocateBases() is called, AP is waken up and does rebase. SmmS3Cr3 is 
used for AP in SMM. But it does not change the fact that SmmS3Cr3 is also used 
in non-SMM mode.

Later in InitializeCpu(), AP wakeup buffer is put to below 1M with SmmS3Cr3.

Thank you
Yao Jiewen

From: Laszlo Ersek [mailto:ler...@redhat.com]
Sent: Thursday, November 10, 2016 7:27 AM
To: Paolo Bonzini <pbonz...@redhat.com>
Cc: Yao, Jiewen <jiewen....@intel.com>; Kinney, Michael D 
<michael.d.kin...@intel.com>; Tian, Feng <feng.t...@intel.com>; 
edk2-de...@ml01.01.org; Zeng, Star <star.z...@intel.com>; Fan, Jeff 
<jeff....@intel.com>
Subject: Re: [edk2] [PATCH V2 0/6] Enable SMM page level protection.

On 11/09/16 23:59, Paolo Bonzini wrote:
>
>> Another question I have -- and I feel I should really know it, but I
>> don't... -- is *why* the APs are executing code from the page at
>> 0x9f000.
>
> This I can answer. :)
>
> The APs have done their INIT-SIPI-SIPI, and then went into the CLI;HLT;JMP
> loop.  When the AP exits SMM, it is in the JMP instruction.
>
> As suggested by Jiewen, edk2 could jump to a 32-bit loop that is _not_
> in the 0-640K area (perhaps it could be in what your doc calls the
> "permanent PEI memory for the S3 resume path"?).  After thinking a
> bit more about it, it seems simplest to me if CpuS3.c just uses
> SwitchStack or AsmDisablePaging64 at the end of MPRendezvousProcedure,
> to jump to a small stub like
>
>     POP EAX   ; pop return address
>     POP EAX   ; pop Context1 which is &mNumberToFinish
>     DEC [EAX]
>  1: CLI
>     HLT
>     JMP 1
>
>> I could be utterly and inexcusably wrong, but I think that the
>> RIP=0x9f0fd symptom is a red herring.
>
> I wouldn't call it a red herring.  After all, CR3 points to SMM
> exactly because the CR3 that was set up for the 0x9f000 stub is
> CpuS3.c's SMRAM page table root.

Hrmpf. The stub at 0x9f000 does not belong to PiSmmCpuDxeSmm. Regardless
of the boot path (normal boot or S3 resume), it belongs to CpuMpPei, and
it partakes in the implementation of the MP services PPI. It is
practically the "parking lot" for the APs when they are not executing
any MP job, submitted by an MP services PPI client.

So, I don't understand how the CR3s that are used by the APs when they
serve MP services PPI requests, throughout the PEI phase (*), have
anything to do with CpuS3.c's page tables (which live in SMRAM, AIUI).

(*) For example, OVMF's PlatformPei uses this service to program
MSR_IA32_FEATURE_CONTROL from fw_cfg. On the resume path too, that
occurs before we do the SMBASE relocation.

(I.e., before S3RestoreConfig2() in
"UefiCpuPkg/Universal/Acpi/S3Resume2Pei/S3Resume.c" calls
SmmRestoreCpu() in "UefiCpuPkg/PiSmmCpuDxeSmm/CpuS3.c", via
SmmS3ResumeState->SmmS3ResumeEntryPoint.)

When an AP executes RSM, its CR3 should automatically be restored to the
original (non-SMM) value, should it not? I mean I do remember the CR3
value from the QEMU register dump, but now I don't understand how that's
possible with SMM=0.

Sorry if I'm being dense :)

> What _is_ a red herring is KVM's trace showing a RSM instruction
> at RIP=0x9f0fd.  That is clearly bogus, RSM was rather the last
> instruction executed _before_ getting to that RIP.
>
>>>       vcpu#0  vcpu#1  vcpu#2  vcpu#3
>>>       ------  ------  ------  ------
>>>               enter           enter
>>>        enter    |     enter     |
>>>          |      |       |       |
>>>        leave    |       |       |
>>>             <--------------------------- BAD
>>>        enter    |       |       |
>>>          |      |       |       |
>>>        leave  leave   leave   leave
>>
>> I don't understand why we don't get horrible faults on the APs
>> *immediately* when the BSP closes/locks down SMRAM. Everything in SMRAM,
>> page tables, executable code, everything, will read as 0xff on QEMU. How
>> can the APs continue in SMM long enough to
>>
>> (a) time out and pull the BSP back into SMM,
>> (b) complete the rendezvous and exit SMM?
>
> Because the "0xff" only applies when you're out of SMM.  The three
> states (open, closed, closed/locked) only apply when you're not in SMM.
> While the AP is in SMM they are executing in a separate address space
> where SMRAM is "not closed".  (In QEMU that's a separate AddressSpace
> struct, smram_address_space in target-i386/kvm.c).

Sigh, in retrospect, this should have been obvious. :) Thanks for
pointing it out!

Laszlo

>> I think I sort of answered question (2). (Apologies if Paolo and Jiewen
>> explained the exact same thing before; I had to spell it out for
>> myself.) That leaves question (1) open. Why enter SMM in
>> S3ResumeExecuteBootScript() at all?
>>
>> Anyway, I think if the BSP and the APs are properly synchronized around
>> the SMI injections in S3ResumeExecuteBootScript(), then this bug is
>> fixed. In that case, the APs' RSMs will restore the full context for the
>> APs, including their sleep in the HLT instruction, in CpuMpPei's wakeup
>> buffer. The BSP will proceed, exit PEI (restoring the CpuMpPei wakeup
>> buffer -- but the APs will sleep on), and then Linux will bring up the
>> APs, after taking control.
>
> Agreed.
>
> Paolo
>
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel

Reply via email to