Laszlo,

Thanks for your testing. It seems that some unknown issue still remains.
I suggest pushing this series of patches first, because it makes significant progress toward solving the AP crash issue in https://bugzilla.tianocore.org/show_bug.cgi?id=216. I could submit another bug to track the "AP lost" issue. Thus, Jiewen's or others' patches could be pushed as long as they have no additional issue apart from "AP lost". I could follow up to fix the "AP lost" issue.

Thanks!
Jeff

-----Original Message-----
From: Laszlo Ersek [mailto:ler...@redhat.com]
Sent: Saturday, November 12, 2016 3:49 AM
To: Fan, Jeff
Cc: edk2-de...@ml01.01.org; Yao, Jiewen; Paolo Bonzini
Subject: Re: [edk2] [PATCH v2 0/3] Put AP into safe hlt-loop code on S3 path

On 11/11/16 06:45, Jeff Fan wrote:
> On S3 path, we will wake up APs to restore CPU context in
> PiSmmCpuDxeSmm driver. In case, one NMI or SMI happens, APs may exit
> from hlt state and execute the instruction after HLT instruction.
>
> But APs are not running on safe code, it leads OVMF S3 boot unstable.
>
> https://bugzilla.tianocore.org/show_bug.cgi?id=216
>
> I tested real platform with 64bit DXE.
>
> v2:
>   1. Make stack alignment per Laszlo's comment.
>   2. Trim whitespace at end of end per Laszlo's comment.
>   3. Update year mark in file header.
>   4. Enhancement on InterlockedDecrement() per Paolo's comment.
>
> Jeff Fan (3):
>   UefiCpuPkg/PiSmmCpuDxeSmm: Put AP into safe hlt-loop code on S3 path
>   UefiCpuPkg/PiSmmCpuDxeSmm: Place AP to 32bit protected mode on S3 path
>   UefiCpuPkg/PiSmmCpuDxeSmm: Decrease mNumberToFinish in AP safe code
>
>  UefiCpuPkg/PiSmmCpuDxeSmm/CpuS3.c             | 33 +++++++++++++-
>  UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmmFuncsArch.c | 29 +++++++++++-
>  UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.h    | 15 +++++++
>  UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmmFuncsArch.c  | 63 ++++++++++++++++++++++++++-
>  4 files changed, 136 insertions(+), 4 deletions(-)
>

Applied this locally to master (ffd6b0b1b65e) for testing.

I tested the series with a suspend-resume loop -- not a busy loop, just manually.
(So there was always one second or so between adjacent steps.)

No crashes or emulation failures, but the "AP going lost" issue remains present -- sometimes Linux cannot bring up one of the four VCPUs after resume.

In the Ia32 case, this "AP lost" symptom surfaced after the 6th resume.

In the Ia32X64 case, I experienced the symptom after the 89th resume.

Thanks
Laszlo
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel