Re: [edk2-devel] [PATCH v9 09/10] OvmfPkg/CpuHotplugSmm: do actual CPU hot-eject

2021-03-16 Thread Laszlo Ersek
On 03/12/21 07:26, Ankur Arora wrote:
> Add logic in EjectCpu() to do the actual the CPU ejection.
> 
> On the BSP, ejection happens by first selecting the CPU via
> its QemuSelector and then sending the QEMU "eject" command.
> QEMU in-turn signals the remote VCPU thread which context-switches
> the CPU out of the SMI handler.
> 
> Meanwhile the CPU being ejected, waits around in its holding
> area until it is context-switched out. Note that it is possible
> that a slow CPU gets ejected before it reaches the wait loop.
> However, this would never happen before it has executed the
> "AllCpusInSync" loop in SmiRendezvous().
> It can mean that an ejected CPU does not execute code after
> that point but given that the CPU state will be destroyed by
> QEMU, the missed cleanup is no great loss.
> 
> Cc: Laszlo Ersek 
> Cc: Jordan Justen 
> Cc: Ard Biesheuvel 
> Cc: Igor Mammedov 
> Cc: Boris Ostrovsky 
> Cc: Aaron Young 
> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
> Signed-off-by: Ankur Arora 
> ---
> 
> Notes:
> Addresses the following comments from v8:
> 
> (1a,1b) CheckIfBsp(): get rid of ProcessorNum, document retval.
> (2) Line up IsBsp and ApicBaseMsr
> (3) s/ongoing SMI iteration/ongoing SMI/
> (4) Get rid of the allusions to alignment in the comment in EjectCpu().
> () Also reduce some of the repetitive detail in this comment.
> (5) EjectCpu(): reorder logic to cleanly separate the AP and the BSP 
> portions.
> (6) Get rid of unnecessary MemoryFence() between QemuCpuhpWrite
> and clearing of the eject status.
> (7) Change type of QemuSelector to %Lu in DEBUG statement
> (8) Get rid of the repetitive comment in SmmCpuFeaturesRendezvousExit().
> The necessary parts of this got moved to patch-7.
> 
>  OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h |   1 +
>  OvmfPkg/CpuHotplugSmm/CpuHotplug.c| 113 
> --
>  2 files changed, 108 insertions(+), 6 deletions(-)

Reviewed-by: Laszlo Ersek 

Thanks
Laszlo

> 
> diff --git a/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h 
> b/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
> index 2ec7a107a64d..d0e83102c13f 100644
> --- a/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
> +++ b/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
> @@ -34,6 +34,7 @@
>  #define QEMU_CPUHP_STAT_ENABLEDBIT0
>  #define QEMU_CPUHP_STAT_INSERT BIT1
>  #define QEMU_CPUHP_STAT_REMOVE BIT2
> +#define QEMU_CPUHP_STAT_EJECT  BIT3
>  #define QEMU_CPUHP_STAT_FW_REMOVE  BIT4
>  
>  #define QEMU_CPUHP_RW_CMD_DATA   0x8
> diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c 
> b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
> index 2eeb4567a262..ae3abd525900 100644
> --- a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
> +++ b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
> @@ -18,6 +18,7 @@
>  #include  // CPU_HOT_EJECT_DATA
>  #include // EFI_MM_CPU_IO_PROTOCOL
>  #include   // EFI_SMM_CPU_SERVICE_PROTOCOL
> +#include  // MSR_IA32_APIC_BASE_REGISTER
>  #include// EFI_STATUS
>  
>  #include "ApicId.h"  // APIC_ID
> @@ -193,12 +194,40 @@ RevokeNewSlot:
>  }
>  
>  /**
> +  EjectCpu needs to know the BSP at SMI exit at a point when
> +  some of the EFI_SMM_CPU_SERVICE_PROTOCOL state has been torn
> +  down.
> +  Reuse the logic from OvmfPkg::PlatformSmmBspElection() to
> +  do that.
> +
> +  @retval TRUE   If the CPU executing this function is the BSP.
> +
> +  @retval FALSE  If the CPU executing this function is an AP.
> +**/
> +STATIC
> +BOOLEAN
> +CheckIfBsp (
> +  VOID
> +  )
> +{
> +  MSR_IA32_APIC_BASE_REGISTER ApicBaseMsr;
> +  BOOLEAN IsBsp;
> +
> +  ApicBaseMsr.Uint64 = AsmReadMsr64 (MSR_IA32_APIC_BASE);
> +  IsBsp = (BOOLEAN)(ApicBaseMsr.Bits.BSP == 1);
> +  return IsBsp;
> +}
> +
> +/**
>CPU Hot-eject handler, called from SmmCpuFeaturesRendezvousExit()
>on each CPU at exit from SMM.
>  
> -  If, the executing CPU is not being ejected, nothing to be done.
> +  If, the executing CPU is neither the BSP, nor being ejected, nothing
> +  to be done.
>If, the executing CPU is being ejected, wait in a halted loop
>until ejected.
> +  If, the executing CPU is the BSP, set QEMU CPU status to eject
> +  for CPUs being ejected.
>  
>@param[in] ProcessorNum  ProcessorNum denotes the CPU exiting SMM,
> and will be used as an index into
> @@ -214,6 +243,83 @@ EjectCpu (
>  {
>UINT64 QemuSelector;
>  
> +  if (CheckIfBsp ()) {
> +UINT32 Idx;
> +
> +for (Idx = 0; Idx < mCpuHotEjectData->ArrayLength; Idx++) {
> +  UINT64 QemuSelector;
> +
> +  QemuSelector = mCpuHotEjectData->QemuSelectorMap[Idx];
> +
> +  if (QemuSelector != CPU_EJECT_QEMU_SELECTOR_INVALID) {
> +//
> +// This to-be-ejected-CPU has already received the BSP's SMI exit
> +// signal and

Re: [edk2-devel] [PATCH v9 09/10] OvmfPkg/CpuHotplugSmm: do actual CPU hot-eject

2021-03-16 Thread Laszlo Ersek
On 03/12/21 07:26, Ankur Arora wrote:

> @@ -214,6 +243,83 @@ EjectCpu (
>  {
>UINT64 QemuSelector;
>  
> +  if (CheckIfBsp ()) {
> +UINT32 Idx;
> +
> +for (Idx = 0; Idx < mCpuHotEjectData->ArrayLength; Idx++) {
> +  UINT64 QemuSelector;

Visual Studio warns that the inner QemuSelector declaration shadows the
outer one. I'm going to remove the inner declaration, due to:

> +
> +  QemuSelector = mCpuHotEjectData->QemuSelectorMap[Idx];
> +
> +  if (QemuSelector != CPU_EJECT_QEMU_SELECTOR_INVALID) {
> +//
> +// This to-be-ejected-CPU has already received the BSP's SMI exit
> +// signal and will execute SmmCpuFeaturesRendezvousExit()
> +// followed by this callback or is already penned in the
> +// CpuSleep() loop below.
> +//
> +// Tell QEMU to context-switch it out.
> +//
> +QemuCpuhpWriteCpuSelector (mMmCpuIo, (UINT32) QemuSelector);
> +QemuCpuhpWriteCpuStatus (mMmCpuIo, QEMU_CPUHP_STAT_EJECT);
> +
> +//
> +// Now that we've ejected the CPU corresponding to 
> QemuSelectorMap[Idx],
> +// clear its eject status to ensure that an invalid future SMI does
> +// not end up trying a spurious eject or a newly hotplugged CPU does
> +// not get penned in the CpuSleep() loop.
> +//
> +// Note that the QemuCpuhpWriteCpuStatus() command above is a write 
> to
> +// a different address space and uses the EFI_MM_CPU_IO_PROTOCOL.
> +//
> +// This means that we are guaranteed that the following assignment
> +// will not be reordered before the eject. And, so we can safely
> +// do this write here.
> +//
> +mCpuHotEjectData->QemuSelectorMap[Idx] =
> +  CPU_EJECT_QEMU_SELECTOR_INVALID;
> +
> +DEBUG ((DEBUG_INFO, "%a: Unplugged ProcessorNum %u, "
> +  "QemuSelector %Lu\n", __FUNCTION__, Idx, QemuSelector));
> +  }
> +}
> +
> +//
> +// We are done until the next hot-unplug; clear the handler.
> +//
> +// mCpuHotEjectData->Handler is a NOP for any CPU not under ejection.
> +// So, once we are done with all the ejections, we can safely reset it
> +// here since any CPU dereferencing it would only see either the old
> +// or the new value (since it is aligned at a natural boundary.)
> +//
> +mCpuHotEjectData->Handler = NULL;
> +return;
> +  }
> +
> +  //
> +  // Reached only on APs
> +  //
> +
> +  //
> +  // mCpuHotEjectData->QemuSelectorMap[ProcessorNum] is updated
> +  // on the BSP in the ongoing SMI at two places:
> +  //
> +  // - UnplugCpus() where the BSP determines if a CPU is under ejection
> +  //   or not. As a comment in UnplugCpus() at set-up, and in
> +  //   SmmCpuFeaturesRendezvousExit() where it is dereferenced describe,
> +  //   any such updates are guaranteed to be ordered-before the
> +  //   dereference below.
> +  //
> +  // - EjectCpu() on the BSP (above) updates QemuSelectorMap[ProcessorNum]
> +  //   for a CPU once it's ejected.
> +  //
> +  //   The CPU under ejection: might be executing anywhere between the
> +  //   AllCpusInSync loop in SmiRendezvous(), to about to dereference
> +  //   QemuSelectorMap[ProcessorNum].
> +  //   As described in the comment above where we do the reset, this
> +  //   is not a problem since the ejected CPU never sees the after value.
> +  //   CPUs not-under ejection: never see any changes so they are fine.
> +  //
>QemuSelector = mCpuHotEjectData->QemuSelectorMap[ProcessorNum];

this reference being to the outer one.

Thanks
Laszlo

>if (QemuSelector == CPU_EJECT_QEMU_SELECTOR_INVALID) {
>  return;
> @@ -495,11 +601,6 @@ CpuHotplugMmi (
>if (EFI_ERROR (Status)) {
>  goto Fatal;
>}
> -  if (ToUnplugCount > 0) {
> -DEBUG ((DEBUG_ERROR, "%a: hot-unplug is not supported yet\n",
> -  __FUNCTION__));
> -goto Fatal;
> -  }
>  
>if (PluggedCount > 0) {
>  Status = ProcessHotAddedCpus (mPluggedApicIds, PluggedCount);
> 



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#72870): https://edk2.groups.io/g/devel/message/72870
Mute This Topic: https://groups.io/mt/81273610/21656
Group Owner: devel+ow...@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-