On Thu, 21 Jan 2021 12:05:36 +0100 David Hildenbrand <da...@redhat.com> wrote:
> vIOMMU support works already with RamDiscardMgr as long as guests only
> map populated memory. Both, populated and discarded memory is mapped
> into &address_space_memory, where vfio_get_xlat_addr() will find that
> memory, to create the vfio mapping.
>
> Sane guests will never map discarded memory (e.g., unplugged memory
> blocks in virtio-mem) into an IOMMU - or keep it mapped into an IOMMU while
> memory is getting discarded. However, there are two cases where a malicious
> guests could trigger pinning of more memory than intended.
>
> One case is easy to handle: the guest trying to map discarded memory
> into an IOMMU.
>
> The other case is harder to handle: the guest keeping memory mapped in
> the IOMMU while it is getting discarded. We would have to walk over all
> mappings when discarding memory and identify if any mapping would be a
> violation. Let's keep it simple for now and print a warning, indicating
> that setting RLIMIT_MEMLOCK can mitigate such attacks.
>
> We have to take care of incoming migration: at the point the
> IOMMUs get restored and start creating mappings in vfio, RamDiscardMgr
> implementations might not be back up and running yet: let's add runstate
> priorities to enforce the order when restoring.
>
> Cc: Paolo Bonzini <pbonz...@redhat.com>
> Cc: "Michael S. Tsirkin" <m...@redhat.com>
> Cc: Alex Williamson <alex.william...@redhat.com>
> Cc: Dr. David Alan Gilbert <dgilb...@redhat.com>
> Cc: Igor Mammedov <imamm...@redhat.com>
> Cc: Pankaj Gupta <pankaj.gupta.li...@gmail.com>
> Cc: Peter Xu <pet...@redhat.com>
> Cc: Auger Eric <eric.au...@redhat.com>
> Cc: Wei Yang <richard.weiy...@linux.alibaba.com>
> Cc: teawater <teawat...@linux.alibaba.com>
> Cc: Marek Kedzierski <mkedz...@redhat.com>
> Signed-off-by: David Hildenbrand <da...@redhat.com>
> ---
>  hw/vfio/common.c            | 35 +++++++++++++++++++++++++++++++++++

Acked-by: Alex Williamson <alex.william...@redhat.com>
Reviewed-by: Alex Williamson <alex.william...@redhat.com>

>  hw/virtio/virtio-mem.c      |  1 +
>  include/migration/vmstate.h |  1 +
>  3 files changed, 37 insertions(+)
>
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 166ec6ec62..15ecd05a4b 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -36,6 +36,7 @@
>  #include "qemu/range.h"
>  #include "sysemu/kvm.h"
>  #include "sysemu/reset.h"
> +#include "sysemu/runstate.h"
>  #include "trace.h"
>  #include "qapi/error.h"
>  #include "migration/migration.h"
> @@ -574,6 +575,40 @@ static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
>          error_report("iommu map to non memory area %"HWADDR_PRIx"",
>                       xlat);
>          return false;
> +    } else if (memory_region_has_ram_discard_mgr(mr)) {
> +        RamDiscardMgr *rdm = memory_region_get_ram_discard_mgr(mr);
> +        RamDiscardMgrClass *rdmc = RAM_DISCARD_MGR_GET_CLASS(rdm);
> +
> +        /*
> +         * Malicious VMs can map memory into the IOMMU, which is expected
> +         * to remain discarded. vfio will pin all pages, populating memory.
> +         * Disallow that. vmstate priorities make sure any RamDiscardMgr were
> +         * already restored before IOMMUs are restored.
> +         */
> +        if (!rdmc->is_populated(rdm, mr, xlat, len)) {
> +            error_report("iommu map to discarded memory (e.g., unplugged via"
> +                         " virtio-mem): %"HWADDR_PRIx"",
> +                         iotlb->translated_addr);
> +            return false;
> +        }
> +
> +        /*
> +         * Malicious VMs might trigger discarding of IOMMU-mapped memory. The
> +         * pages will remain pinned inside vfio until unmapped, resulting in a
> +         * higher memory consumption than expected. If memory would get
> +         * populated again later, there would be an inconsistency between pages
> +         * pinned by vfio and pages seen by QEMU. This is the case until
> +         * unmapped from the IOMMU (e.g., during device reset).
> +         *
> +         * With malicious guests, we really only care about pinning more memory
> +         * than expected. RLIMIT_MEMLOCK set for the user/process can never be
> +         * exceeded and can be used to mitigate this problem.
> +         */
> +        warn_report_once("Using vfio with vIOMMUs and coordinated discarding of"
> +                         " RAM (e.g., virtio-mem) works, however, malicious"
> +                         " guests can trigger pinning of more memory than"
> +                         " intended via an IOMMU. It's possible to mitigate "
> +                         " by setting/adjusting RLIMIT_MEMLOCK.");
>      }
>
>      /*
> diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
> index 6200813bb8..f419a758f3 100644
> --- a/hw/virtio/virtio-mem.c
> +++ b/hw/virtio/virtio-mem.c
> @@ -871,6 +871,7 @@ static const VMStateDescription vmstate_virtio_mem_device = {
>      .name = "virtio-mem-device",
>      .minimum_version_id = 1,
>      .version_id = 1,
> +    .priority = MIG_PRI_VIRTIO_MEM,
>      .post_load = virtio_mem_post_load,
>      .fields = (VMStateField[]) {
>          VMSTATE_WITH_TMP(VirtIOMEM, VirtIOMEMMigSanityChecks,
> diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> index 075ee80096..3bf58ff043 100644
> --- a/include/migration/vmstate.h
> +++ b/include/migration/vmstate.h
> @@ -153,6 +153,7 @@ typedef enum {
>      MIG_PRI_DEFAULT = 0,
>      MIG_PRI_IOMMU,              /* Must happen before PCI devices */
>      MIG_PRI_PCI_BUS,            /* Must happen before IOMMU */
> +    MIG_PRI_VIRTIO_MEM,         /* Must happen before IOMMU */
>      MIG_PRI_GICV3_ITS,          /* Must happen before PCI devices */
>      MIG_PRI_GICV3,              /* Must happen before the ITS */
>      MIG_PRI_MAX,