On Wed, Jun 10, 2026 at 05:00:11PM +0530, Naveen N Rao wrote:
> [+Sean]
> 
> Hi Mike,
> 
> On Tue, Jun 09, 2026 at 07:35:46PM -0500, Michael Roth wrote:
> > On Tue, Jun 02, 2026 at 12:42:13PM +0530, Naveen N Rao (AMD) wrote:
> > > KVM commit 66155de93bcf ("KVM: x86: Disallow read-only memslots for
> > > SEV-ES and SEV-SNP (and TDX)"), and the subsequent commit d30d9ee94cc0
> > > ("KVM: x86: Only advertise KVM_CAP_READONLY_MEM when supported by VM")
> > > stopped advertising KVM_CAP_READONLY_MEM support for encrypted guests
> > > (KVM_X86_SEV_ES_VM and KVM_X86_SNP_VM), but not for KVM_X86_DEFAULT_VM
> > > type SEV-ES guests. As a result of this, it is no longer possible to
> > > start SEV-ES guests with any SEV feature enabled (in particular,
> > > debug-swap) with pflash devices.
> > > 
> > > This is an issue since SEV-ES guests have historically used pflash
> > > devices for OVMF. Guests on older KVM+Qemu are able to enable debug-swap
> > > while using pflash devices, so work around the KVM limitation by
> > > switching to using a VMA-based write protection. This allows
> > > well-behaved SEV-ES guests to continue to work with pflash devices and
> > > enable debug-swap. Mis-behaving guests trying to write to the protected
> > > OVMF area will be killed.
> > 
> > Based on Sean's description, a write access to a read-only memslot would
> > cause the vCPU to permanently spin on #NPFs if trying to write to it as
> > MMIO due to #VC handler not triggering, and that's why we don't support
> > read-only memslots. But since SEV-ES was previously working with pflash,
> > it seems like it does not rely on this functionality...
> 
> Right, normal well-behaved SEV-ES/SNP guests work just fine as they 
> don't write to any of the read-only areas.
> 
> > 
> > So if OVMF isn't writing to write-protected memory, then it wouldn't be
> > triggering the MMIO emulation path in the first place. And if we don't
> > care about enabling the emulation path in this case... then I'm not sure
> > the original reasons for not allowing it for SEV-ES/SNP are applicable.
> 
> Guest (not just OVMF) could try and write to the read-only area 
> triggering this issue. A simple write to 0xc0000 from within the guest 
> triggers this.

Is that still true even with this patch?

  commit 0f4a1e80989aca185d955fcd791d7750082044a2
  Author: Kevin Loughlin <[email protected]>
  Date:   Wed Mar 13 12:15:46 2024 +0000
  
      x86/sev: Skip ROM range scans and validation for SEV-SNP guests
      
      SEV-SNP requires encrypted memory to be validated before access.
      Because the ROM memory range is not part of the e820 table, it is not
      pre-validated by the BIOS. Therefore, if a SEV-SNP guest kernel wishes
      to access this range, the guest must first validate the range.
      
      The current SEV-SNP code does indeed scan the ROM range during early
      boot and thus attempts to validate the ROM range in probe_roms().
      However, this behavior is neither sufficient nor necessary for the
      following reasons:

      ...

but in that case, those private accesses didn't work because they were
accessing legacy MMIO regions as private/encrypted even though none of the
option ROMs were loaded into memory as encrypted, so they're basically just
garbage/legacy regions we try to completely ignore on the guest-side now and
any lingering cases should probably get the same treatment.

It would be nice to still be able to catch write accesses....but I think we
still could (with the kernel changes discussed in my reply to Tom) if we
really wanted that. But is that really a hard requirement? Personally, the
-bios vs pflash argument thing makes this feel justified since -bios also
let's the writes through silently, but maybe we can do better with kernel
changes.

> 
> > 
> > It feels like KVM_CAP_READONLY_MEM is more like KVM_CAP_EMULATE_ON_WRITE,
> > whereas we literally just need as actually slot that's permanently mapped
> > in the NPT without write access.
> > 
> > Is that an accurate summary of the situation?
> 
> Yes, that sounds right to me.
> 
> > 
> > If so, maybe we can introduce a KVM_CAP_READONLY_NO_MMIO that captures
> > this and simply errors out if it hits the KVM_PFN_ERR_RO_FAULT.
> 
> That would certainly work.
> 
> > Or, for
> > a QEMU-specific workaround, just have a pflash implementation that doesn't
> > rely on KVM_MEM_READONLY for cases like this where we don't need MMIO
> > emulation.
> 
> Not sure I follow that... are you suggesting that pflash use regular RW 
> memslots and just let the write through?

Yes, isn't that basically what we're getting with -bios? At least this
way we don't have the awkwardness of needing to randomly switch from -bios
to pflash based on what SEV features the user selects, which is pretty
bad.

But that was more of a last resort, maybe we haven't yet bottomed out on
whether we could do things a bit more nicely with some kernel help as
discussed elsewhere in this thread.

-Mike

> 
> 
> Thanks,
> Naveen
> 
> > There's actually another case in hw/nvram/nrf51_nvm.c where this
> > would be applicable. I guess it could be done automatically for the
> > confidential VM case to retain cmdline compatibility...though you're
> > wanting to add the debugswap feature anyway so not sure how important that
> > aspect is.
> > 
> > Thanks,
> > 
> > Mike
> >
> > > 
> > > Enable VMA protection and set the memory to be RO when adding the KVM
> > > memory slot. Because pflash devices support command-mode, change VMA
> > > protection to RW when tearing down the KVM memory slot. KVM
> > > SEV_LAUNCH_UPDATE also requires memory to be RW, so disable the
> > > protection when calling that.
> > > 
> > > Print a warning when switching to VMA-based protection so that it is
> > > clear that KVM itself isn't supporting readonly memory, and that a
> > > workaround is in place. Users can plan on switching to using '-bios'.
> > > 
> > > Finally, drop the check rejecting SEV-ES guests with SEV features so 
> > > that debug-swap can be enabled.
> > > 
> > > 
> > > Signed-off-by: Naveen N Rao (AMD) <[email protected]>
> > > ---
> > > Background discussion on this issue:
> > > http://lore.kernel.org/r/[email protected]
> > > 
> > > This series depends on VMSA features support:
> > > http://lore.kernel.org/r/[email protected]
> > > 
> > > 
> > > - Naveen
> > > 
> > > 
> > >  include/system/kvm.h     |  5 ++++
> > >  include/system/kvm_int.h |  1 +
> > >  accel/kvm/kvm-all.c      | 52 +++++++++++++++++++++++++++++++++++++++-
> > >  hw/i386/pc_sysfw.c       | 19 +++++++++------
> > >  target/i386/sev.c        | 21 +++++++++++-----
> > >  5 files changed, 84 insertions(+), 14 deletions(-)
> > > 
> > > diff --git a/include/system/kvm.h b/include/system/kvm.h
> > > index 5fa33eddda38..585058bd6f1c 100644
> > > --- a/include/system/kvm.h
> > > +++ b/include/system/kvm.h
> > > @@ -555,6 +555,8 @@ uint32_t kvm_dirty_ring_size(void);
> > >  
> > >  void kvm_mark_guest_state_protected(void);
> > >  
> > > +void kvm_enable_ro_mem_vma_protection(void);
> > > +
> > >  /**
> > >   * kvm_hwpoisoned_mem - indicate if there is any hwpoisoned page
> > >   * reported for the VM.
> > > @@ -568,6 +570,9 @@ int kvm_set_memory_attributes_shared(hwaddr start, 
> > > uint64_t size);
> > >  
> > >  int kvm_convert_memory(hwaddr start, hwaddr size, bool to_private);
> > >  
> > > +void kvm_set_memory_readonly(void *addr, size_t len);
> > > +void kvm_set_memory_readwrite(void *addr, size_t len);
> > > +
> > >  /* argument to vmfd change notifier */
> > >  typedef struct VmfdChangeNotifier {
> > >      int vmfd;
> > > diff --git a/include/system/kvm_int.h b/include/system/kvm_int.h
> > > index 0876aac938d3..0e083a56ce2a 100644
> > > --- a/include/system/kvm_int.h
> > > +++ b/include/system/kvm_int.h
> > > @@ -123,6 +123,7 @@ struct KVMState
> > >      OnOffAuto kernel_irqchip_split;
> > >      bool sync_mmu;
> > >      bool guest_state_protected;
> > > +    bool guest_wants_ro_mem_vma_protection;
> > >      uint64_t manual_dirty_log_protect;
> > >      /*
> > >       * Older POSIX says that ioctl numbers are signed int, but in
> > > diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> > > index 96f90ebb240f..4208df5b25ac 100644
> > > --- a/accel/kvm/kvm-all.c
> > > +++ b/accel/kvm/kvm-all.c
> > > @@ -1629,6 +1629,42 @@ int kvm_set_memory_attributes_shared(hwaddr start, 
> > > uint64_t size)
> > >      return kvm_set_memory_attributes(start, size, 0);
> > >  }
> > >  
> > > +static void kvm_set_memory_flags(void *addr, size_t len, int flags)
> > > +{
> > > +    if (mprotect(addr, len, flags)) {
> > > +        error_report("failed to apply memory protection "
> > > +                     "(0x%" HWADDR_PRIx "+0x%" PRIx64 ") error '%s'",
> > > +                     (hwaddr)addr, len, strerror(errno));
> > > +        exit(1);
> > > +    }
> > > +}
> > > +
> > > +void kvm_set_memory_readonly(void *addr, size_t len)
> > > +{
> > > +    if (kvm_state->guest_wants_ro_mem_vma_protection) {
> > > +        kvm_set_memory_flags(addr, len, PROT_READ);
> > > +    }
> > > +}
> > > +
> > > +void kvm_set_memory_readwrite(void *addr, size_t len)
> > > +{
> > > +    if (kvm_state->guest_wants_ro_mem_vma_protection) {
> > > +        kvm_set_memory_flags(addr, len, PROT_READ | PROT_WRITE);
> > > +    }
> > > +}
> > > +
> > > +static bool kvm_mem_wants_vma_protection(MemoryRegion *mr)
> > > +{
> > > +    if (!memory_region_is_ram(mr) &&
> > > +        (mr->readonly || mr->rom_device) &&
> > > +        !kvm_readonly_mem_allowed &&
> > > +        kvm_state->guest_wants_ro_mem_vma_protection) {
> > > +            return true;
> > > +    }
> > > +
> > > +    return false;
> > > +}
> > > +
> > >  /* Called with KVMMemoryListener.slots_lock held */
> > >  static void kvm_set_phys_mem(KVMMemoryListener *kml,
> > >                               MemoryRegionSection *section, bool add)
> > > @@ -1642,7 +1678,8 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
> > >      void *ram;
> > >  
> > >      if (!memory_region_is_ram(mr)) {
> > > -        if (writable || !kvm_readonly_mem_allowed) {
> > > +        if (writable || (!kvm_readonly_mem_allowed &&
> > > +                         !kvm_state->guest_wants_ro_mem_vma_protection)) 
> > > {
> > >              return;
> > >          } else if (!mr->romd_mode) {
> > >              /* If the memory device is not in romd_mode, then we 
> > > actually want
> > > @@ -1697,6 +1734,10 @@ static void kvm_set_phys_mem(KVMMemoryListener 
> > > *kml,
> > >                  kvm_slot_sync_dirty_pages(mem);
> > >              }
> > >  
> > > +            if (kvm_mem_wants_vma_protection(mr)) {
> > > +                kvm_set_memory_readwrite(mem->ram, mem->memory_size);
> > > +            }
> > > +
> > >              /* unregister the slot */
> > >              g_free(mem->dirty_bmap);
> > >              mem->dirty_bmap = NULL;
> > > @@ -1746,6 +1787,10 @@ static void kvm_set_phys_mem(KVMMemoryListener 
> > > *kml,
> > >              }
> > >          }
> > >  
> > > +        if (kvm_mem_wants_vma_protection(mr)) {
> > > +            kvm_set_memory_readonly(mem->ram, mem->memory_size);
> > > +        }
> > > +
> > >          start_addr += slot_size;
> > >          ram_start_offset += slot_size;
> > >          ram += slot_size;
> > > @@ -4771,6 +4816,11 @@ void kvm_mark_guest_state_protected(void)
> > >      kvm_state->guest_state_protected = true;
> > >  }
> > >  
> > > +void kvm_enable_ro_mem_vma_protection(void)
> > > +{
> > > +    kvm_state->guest_wants_ro_mem_vma_protection = true;
> > > +}
> > > +
> > >  int kvm_create_guest_memfd(uint64_t size, uint64_t flags, Error **errp)
> > >  {
> > >      int fd;
> > > diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
> > > index 1a41a5972bd0..9590458e00c5 100644
> > > --- a/hw/i386/pc_sysfw.c
> > > +++ b/hw/i386/pc_sysfw.c
> > > @@ -254,13 +254,18 @@ void pc_system_firmware_init(PCMachineState *pcms,
> > >          }
> > >      } else {
> > >          if (kvm_enabled() && !kvm_readonly_mem_enabled()) {
> > > -            /*
> > > -             * Older KVM cannot execute from device memory. So, flash
> > > -             * memory cannot be used unless the readonly memory kvm
> > > -             * capability is present.
> > > -             */
> > > -            error_report("pflash with kvm requires KVM readonly memory 
> > > support");
> > > -            exit(1);
> > > +            if (sev_es_enabled() && !sev_snp_enabled()) {
> > > +                warn_report("pflash not supported with SEV-ES guests, "
> > > +                            "attempting VMA based protection");
> > > +            } else {
> > > +                /*
> > > +                 * Older KVM cannot execute from device memory. So, flash
> > > +                 * memory cannot be used unless the readonly memory kvm
> > > +                 * capability is present.
> > > +                 */
> > > +                error_report("pflash with kvm requires KVM readonly 
> > > memory support");
> > > +                exit(1);
> > > +            }
> > >          }
> > >  
> > >          pc_system_flash_map(pcms, rom_memory);
> > > diff --git a/target/i386/sev.c b/target/i386/sev.c
> > > index f04ae4e91f3e..82cf2c562729 100644
> > > --- a/target/i386/sev.c
> > > +++ b/target/i386/sev.c
> > > @@ -550,12 +550,6 @@ static int check_sev_features(SevCommonState 
> > > *sev_common, uint64_t sev_features,
> > >              return -1;
> > >          }
> > >      } else {
> > > -        if (sev_features && sev_es_enabled()) {
> > > -            error_setg(errp,
> > > -                       "%s: SEV features are not supported with SEV-ES 
> > > at this time",
> > > -                       __func__);
> > > -            return -1;
> > > -        }
> > >          if (sev_features & SVM_SEV_FEAT_SNP_ACTIVE) {
> > >              error_setg(errp,
> > >                         "%s: SEV_SNP is not enabled but is enabled in 
> > > VMSA sev_features",
> > > @@ -2024,6 +2018,15 @@ static int sev_kvm_init(ConfidentialGuestSupport 
> > > *cgs, Error **errp)
> > >          return -1;
> > >      }
> > >  
> > > +    /*
> > > +     * Use VMA-based protection for SEV-ES guests that enable any
> > > +     * SEV feature, since KVM does not advertise readonly memory
> > > +     * support for non-default type SEV guests.
> > > +     */
> > > +    if (sev_es_enabled() && SEV_COMMON(cgs)->sev_features) {
> > > +        kvm_enable_ro_mem_vma_protection();
> > > +    }
> > > +
> > >      if (!cgs->ready) {
> > >          /*
> > >           * SEV uses these notifiers to register/pin pages prior to guest 
> > > use,
> > > @@ -2111,7 +2114,13 @@ sev_encrypt_flash(hwaddr gpa, uint8_t *ptr, 
> > > uint64_t len, Error **errp)
> > >      if (sev_check_state(sev_common, SEV_STATE_LAUNCH_UPDATE)) {
> > >          int ret;
> > >  
> > > +        /*
> > > +         * KVM requires these pages to be RW, so remove VMA RO protection
> > > +         * for the duration of SEV_LAUNCH_UPDATE if using SEV features.
> > > +         */
> > > +        kvm_set_memory_readwrite(ptr, len);
> > >          ret = klass->launch_update_data(sev_common, gpa, ptr, len, errp);
> > > +        kvm_set_memory_readonly(ptr, len);
> > >          if (ret < 0) {
> > >              return ret;
> > >          }
> > > 
> > > base-commit: 5611a9268dae7b7ff99d478ed134052a9fc7e9f7
> > > prerequisite-patch-id: dc27ad6297d47d063b04fa797c1b8203ee97d9c8
> > > prerequisite-patch-id: 603eff49233c4b0483e7c405754b95aa455dd38c
> > > prerequisite-patch-id: d4085e72ecfb0fbf977f7358d1edd29951b93784
> > > prerequisite-patch-id: f6f201825b1e56f89a87a26bc457b3e6018aee49
> > > prerequisite-patch-id: 08d79cec4c3f117178be8f6c866ff1be08e971f3
> > > prerequisite-patch-id: c112bccab9ab9cee2d0227516fc857590b99a75b
> > > prerequisite-patch-id: f42fd829d0ea4909537e722477057a4013a247ab
> > > prerequisite-patch-id: 67136368ed1d2fa0ae55fee4368a7bd1fe394368
> > > prerequisite-patch-id: 08349bb1e0e11ee1518a9041771302c97866b5cd
> > > -- 
> > > 2.54.0
> > > 
> > > 

Reply via email to