Laszlo,

Thanks for your testing. It seems that there is still an unknown issue remaining.

I suggest pushing this series of patches first, because it makes significant
progress toward solving the AP crash issue in
https://bugzilla.tianocore.org/show_bug.cgi?id=216.

I can file a separate bug to track the "AP lost" issue. That way, Jiewen's or
others' patches can be pushed as long as they introduce no additional issues
beyond "AP lost".

I can follow up to fix the "AP lost" issue.

Thanks!
Jeff


-----Original Message-----
From: Laszlo Ersek [mailto:ler...@redhat.com] 
Sent: Saturday, November 12, 2016 3:49 AM
To: Fan, Jeff
Cc: edk2-de...@ml01.01.org; Yao, Jiewen; Paolo Bonzini
Subject: Re: [edk2] [PATCH v2 0/3] Put AP into safe hlt-loop code on S3 path

On 11/11/16 06:45, Jeff Fan wrote:
> On the S3 path, we wake up APs to restore CPU context in the
> PiSmmCpuDxeSmm driver. If an NMI or SMI occurs, APs may exit
> from the hlt state and execute the instruction after the HLT instruction.
> 
> But the APs are not running in safe code, which makes OVMF S3 boot unstable.
> 
> https://bugzilla.tianocore.org/show_bug.cgi?id=216
> 
> I tested on a real platform with 64-bit DXE.
> 
> v2:
>   1. Make stack alignment per Laszlo's comment.
>   2. Trim whitespace at end of line per Laszlo's comment.
>   3. Update year mark in file header.
>   4. Enhancement on InterlockedDecrement() per Paolo's comment.
> 
> Jeff Fan (3):
>   UefiCpuPkg/PiSmmCpuDxeSmm: Put AP into safe hlt-loop code on S3 path
>   UefiCpuPkg/PiSmmCpuDxeSmm: Place AP to 32bit protected mode on S3 path
>   UefiCpuPkg/PiSmmCpuDxeSmm: Decrease mNumberToFinish in AP safe code
> 
>  UefiCpuPkg/PiSmmCpuDxeSmm/CpuS3.c             | 33 +++++++++++++-
>  UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmmFuncsArch.c | 29 +++++++++++-
>  UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.h    | 15 +++++++
>  UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmmFuncsArch.c  | 63 ++++++++++++++++++++++++++-
>  4 files changed, 136 insertions(+), 4 deletions(-)
> 

Applied this locally to master (ffd6b0b1b65e) for testing. I tested the series 
with a suspend-resume loop -- not a busy loop, just manually. (So there was 
always one second or so between adjacent steps.)

No crashes or emulation failures, but the "AP going lost" issue remains present 
-- sometimes Linux cannot bring up one of the four VCPUs after resume.

In the Ia32 case, this "AP lost" symptom surfaced after the 6th resume.

In the Ia32X64 case, I experienced the symptom after the 89th resume.

Thanks
Laszlo
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel