On 12/09/2023 22:18, John Allen wrote:
> In the event that a guest process attempts to access memory that has
> been poisoned in response to a deferred uncorrected MCE, an AMD system
> will currently generate a SIGBUS error which will result in the entire
> guest being shut down. Ideally, we only want to kill the guest process
> that accessed poisoned memory in this case.
> 
> This support has been included in qemu for Intel hosts for a long time,
> but there are a couple of changes needed for AMD hosts. First, we will
> need to expose the SUCCOR cpuid bit to guests. Second, we need to modify
> the MCE injection code to avoid Intel-specific behavior when we are
> running on an AMD host.
> 

Is there any update with respect to this series?

John's series should fix MCE injection on AMD. As of today, an MCE that happens
in the hypervisor (sadly) just crashes the guest.
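
For anyone following along: as far as I know, the SUCCOR bit mentioned in the
cover letter is bit 1 of EBX in CPUID leaf 0x80000007 (the RAS capabilities
leaf). Below is a minimal sketch of decoding it and of only exposing it to a
guest when the host reports it. The macro and function names are mine, and the
"mask against the host" policy is an assumption for illustration, not
necessarily what the series does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration; not taken from the series. */
#define RAS_CAPS_LEAF   0x80000007u  /* CPUID leaf whose EBX holds RAS capabilities */
#define RAS_EBX_SUCCOR  (1u << 1)    /* SUCCOR: uncorrectable error containment and recovery */

/* Return true if an EBX value read from leaf 0x80000007 advertises SUCCOR. */
bool ebx_has_succor(uint32_t ebx)
{
    return (ebx & RAS_EBX_SUCCOR) != 0;
}

/*
 * Assumed policy: the guest only sees SUCCOR when it was requested
 * and the host EBX value reports it. Other bits pass through unchanged.
 */
uint32_t guest_ras_ebx(uint32_t requested_ebx, uint32_t host_ebx)
{
    if (!ebx_has_succor(host_ebx)) {
        requested_ebx &= ~RAS_EBX_SUCCOR;
    }
    return requested_ebx;
}
```

On real hardware the host EBX value would come from executing CPUID with
EAX = RAS_CAPS_LEAF; the sketch takes it as a parameter so it stays portable.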

William, Paolo, I think the sort-of-dependency(?) of this series, where we block
migration if a page has been poisoned, is already in Peter's migration tree[1]
(CC'ed). So perhaps this series just needs John to resend it, given that it's
been a couple of months since v4?

[1]
https://lore.kernel.org/qemu-devel/20240130190640.139364-2-william.ro...@oracle.com/

> v2:
>   - Add "succor" feature word.
>   - Add case to kvm_arch_get_supported_cpuid for the SUCCOR feature.
> 
> v3:
>   - Reorder series. Only enable SUCCOR after bugs have been fixed.
>   - Introduce new patch ignoring AO errors.
> 
> v4:
>   - Remove redundant check for AO errors.
> 
> John Allen (2):
>   i386: Fix MCE support for AMD hosts
>   i386: Add support for SUCCOR feature
> 
> William Roche (1):
>   i386: Explicitly ignore unsupported BUS_MCEERR_AO MCE on AMD guest
> 
>  target/i386/cpu.c     | 18 +++++++++++++++++-
>  target/i386/cpu.h     |  4 ++++
>  target/i386/helper.c  |  4 ++++
>  target/i386/kvm/kvm.c | 28 ++++++++++++++++++++--------
>  4 files changed, 45 insertions(+), 9 deletions(-)
> 

