On 1/12/23 09:28, Laszlo Ersek wrote:
> In QEMU v5.1.0, the CPU hotplug register block misbehaves: the negotiation
> protocol is (effectively) broken such that it suggests that switching from
> the legacy interface to the modern interface works, but in reality the
> switch never happens. The symptom has been witnessed when using TCG
> acceleration; KVM seems to mask the issue. The issue persists with the
> following (latest) stable QEMU releases: v5.2.0, v6.2.0, v7.2.0. Currently
> there is no stable release that addresses the problem.
> 
> The QEMU bug confuses the Present and Possible counting in function
> PlatformMaxCpuCountInitialization(), in
> "OvmfPkg/Library/PlatformInitLib/Platform.c". OVMF ends up with Present=0
> Possible=1. This in turn further confuses MpInitLib in UefiCpuPkg (hence
> firmware-time multiprocessing will be broken). Worse, CPU hot(un)plug with
> SMI will be summarily broken in OvmfPkg/CpuHotplugSmm, which (considering
> the privilege level of SMM) is not that great.
> 
> Detect the issue in PlatformMaxCpuCountInitialization(), and print an
> error message and *hang* if the issue is present.
> 
> The problem was originally reported by Ard [0]. We analyzed it at [1] and
> [2]. A QEMU patch was sent at [3]; now merged as commit dab30fbef389
> ("acpi: cpuhp: fix guest-visible maximum access size to the legacy reg
> block", 2023-01-08), to be included in QEMU v8.0.0.
> 
> [0] https://bugzilla.tianocore.org/show_bug.cgi?id=4234#c2
> 
> [1] https://bugzilla.tianocore.org/show_bug.cgi?id=4234#c3
> 
> [2] IO port write width clamping differs between TCG and KVM
>     
> http://mid.mail-archive.com/aaedee84-d3ed-a4f9-21e7-d221a28d1683@redhat.com
>     https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg00199.html
> 
> [3] acpi: cpuhp: fix guest-visible maximum access size to the legacy reg block
>     http://mid.mail-archive.com/20230104090138.214862-1-lersek@redhat.com
>     https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg00278.html
> 
> NOTE: PlatformInitLib is used in the following platform DSCs:
> 
>   OvmfPkg/AmdSev/AmdSevX64.dsc
>   OvmfPkg/CloudHv/CloudHvX64.dsc
>   OvmfPkg/IntelTdx/IntelTdxX64.dsc
>   OvmfPkg/Microvm/MicrovmX64.dsc
>   OvmfPkg/OvmfPkgIa32.dsc
>   OvmfPkg/OvmfPkgIa32X64.dsc
>   OvmfPkg/OvmfPkgX64.dsc
> 
> but I can only test this change with the last three platforms, running on
> QEMU.
> 
> Test results:
> 
>   TCG  QEMU     OVMF     result
>        patched  patched
>   ---  -------  -------  -------------------------------------------------
>   0    0        0        CPU counts OK (KVM masks the QEMU bug)
>   0    0        1        CPU counts OK (KVM masks the QEMU bug)
>   0    1        0        CPU counts OK (QEMU fix, but KVM masks the QEMU
>                          bug anyway)
>   0    1        1        CPU counts OK (QEMU fix, but KVM masks the QEMU
>                          bug anyway)
>   1    0        0        boot with broken CPU counts (original QEMU bug)
>   1    0        1        broken CPU count caught (boot hangs)
>   1    1        0        CPU counts OK (QEMU fix)
>   1    1        1        CPU counts OK (QEMU fix)
> 
> Cc: Ard Biesheuvel <ardb+tianoc...@kernel.org>
> Cc: Brijesh Singh <brijesh.si...@amd.com>
> Cc: Erdem Aktas <erdemak...@google.com>
> Cc: Gerd Hoffmann <kra...@redhat.com>
> Cc: James Bottomley <j...@linux.ibm.com>
> Cc: Jiewen Yao <jiewen....@intel.com>
> Cc: Jordan Justen <jordan.l.jus...@intel.com>
> Cc: Min Xu <min.m...@intel.com>
> Cc: Oliver Steffen <ostef...@redhat.com>
> Cc: Sebastien Boeuf <sebastien.bo...@intel.com>
> Cc: Tom Lendacky <thomas.lenda...@amd.com>
> Bugzilla: https://bugzilla.tianocore.org/show_bug.cgi?id=4250
> Reviewed-by: Gerd Hoffmann <kra...@redhat.com>
> Signed-off-by: Laszlo Ersek <ler...@redhat.com>
> ---
> 
> Notes:
>     v2:
>     
>     - V1 was at
>       <20230104151234.286030-1-lersek@redhat.com">http://mid.mail-archive.com/20230104151234.286030-1-lersek@redhat.com>.
>     
>     - Repo: <https://pagure.io/lersek/edk2.git>, branch:
>       cpuhp-reg-catch-4250-v2
>     
>     - Remove KVM as a proposed workaround from the error message, because in
>       the QEMU discussion, we had found that the KVM accelerator's behavior
>       in QEMU (masking the problem) was not right, and that a fix for that
>       had been in progress for quite some time.
>     
>     - Add the QEMU commit hash to the commit message, the code comment, and
>       the error message.
>     
>     - Pick up Gerd's R-b; add Oliver to the Cc list.
> 
>  OvmfPkg/Library/PlatformInitLib/Platform.c | 35 ++++++++++++++++++++
>  1 file changed, 35 insertions(+)
> 
> diff --git a/OvmfPkg/Library/PlatformInitLib/Platform.c 
> b/OvmfPkg/Library/PlatformInitLib/Platform.c
> index 3e13c5d4b34f..13348afb4890 100644
> --- a/OvmfPkg/Library/PlatformInitLib/Platform.c
> +++ b/OvmfPkg/Library/PlatformInitLib/Platform.c
> @@ -541,6 +541,41 @@ PlatformMaxCpuCountInitialization (
>          ASSERT (Selected == Possible || Selected == 0);
>        } while (Selected > 0);
>  
> +      //
> +      // Sanity check: we need at least 1 present CPU (CPU#0 is always 
> present).
> +      //
> +      // The legacy-to-modern switching of the CPU hotplug register block got
> +      // broken (for TCG) in QEMU v5.1.0. Refer to "IO port write width 
> clamping
> +      // differs between TCG and KVM" at
> +      // 
> <http://mid.mail-archive.com/aaedee84-d3ed-a4f9-21e7-d221a28d1683@redhat.com>
> +      // or at
> +      // 
> <https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg00199.html>.
> +      //
> +      // QEMU received the fix in commit dab30fbef389 ("acpi: cpuhp: fix
> +      // guest-visible maximum access size to the legacy reg block",
> +      // 2023-01-08), to be included in QEMU v8.0.0.
> +      //
> +      // If we're affected by this QEMU bug, then we must not continue: it
> +      // confuses the multiprocessing in UefiCpuPkg/Library/MpInitLib, and
> +      // breaks CPU hot(un)plug with SMI in OvmfPkg/CpuHotplugSmm.
> +      //
> +      if (Present == 0) {
> +        DEBUG ((
> +          DEBUG_ERROR,
> +          "%a: Broken CPU hotplug register block: Present=%u Possible=%u.\n"
> +          "%a: Update QEMU to v8, or to stable with dab30fbef389 
> backported.\n"
> +          "%a: Refer to "
> +          "<https://bugzilla.tianocore.org/show_bug.cgi?id=4250>.\n",
> +          __FUNCTION__,
> +          Present,
> +          Possible,
> +          __FUNCTION__,
> +          __FUNCTION__
> +          ));
> +        ASSERT (FALSE);
> +        CpuDeadLoop ();
> +      }
> +
>        //
>        // Sanity check: fw_cfg and the modern CPU hotplug interface should
>        // return the same boot CPU count.
> 

please do with this what you will



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#98388): https://edk2.groups.io/g/devel/message/98388
Mute This Topic: https://groups.io/mt/96218818/21656
Group Owner: devel+ow...@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Reply via email to