On Tue, 2013-04-23 at 19:16 +0000, Yoder Stuart-B08248 wrote:
> 
> > -----Original Message-----
> > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > Sent: Tuesday, April 23, 2013 11:56 AM
> > To: Yoder Stuart-B08248
> > Cc: Joerg Roedel; iommu@lists.linux-foundation.org
> > Subject: Re: RFC: vfio / iommu driver for hardware with no iommu
> > 
> > On Tue, 2013-04-23 at 16:13 +0000, Yoder Stuart-B08248 wrote:
> > > Joerg/Alex,
> > >
> > > We have embedded systems where we use QEMU/KVM and have
> > > the requirement to do device assignment, but have no
> > > iommu.  So we would like to get vfio-pci working on
> > > systems like this.
> > >
> > > We're aware of the obvious limitations-- no protection,
> > > DMA'able memory must be physically contiguous and will
> > > have no iova->phy translation.  But there are use cases
> > > where all OSes involved are trusted and customers can
> > > live with those limitations.   Virtualization is used
> > > here not to sandbox untrusted code, but to consolidate
> > > multiple OSes.
> > >
> > > We would like to get your feedback on the rough idea.  There
> > > are two parts-- iommu driver and vfio-pci.
> > >
> > > 1.  iommu driver
> > >
> > > First, we still need device groups created because vfio
> > > is based on that, so we envision a 'dummy' iommu
> > > driver that implements only the add/remove device
> > > ops.  Something like:
> > >
> > >     static struct iommu_ops fsl_none_ops = {
> > >             .add_device     = fsl_none_add_device,
> > >             .remove_device  = fsl_none_remove_device,
> > >     };
> > >
> > >     int fsl_iommu_none_init(void)
> > >     {
> > >             int ret = 0;
> > >
> > >             ret = iommu_init_mempool();
> > >             if (ret)
> > >                     return ret;
> > >
> > >             bus_set_iommu(&platform_bus_type, &fsl_none_ops);
> > >             bus_set_iommu(&pci_bus_type, &fsl_none_ops);
> > >
> > >             return ret;
> > >     }
> > >
> > > 2.  vfio-pci
> > >
> > > For vfio-pci, we would ideally like to keep user space mostly
> > > unchanged.  User space will have to follow the semantics
> > > of mapping only physically contiguous chunks...and iova
> > > will equal phys.
> > >
> > > So, we propose to implement a new vfio iommu type,
> > > called VFIO_TYPE_NONE_IOMMU.  This implements
> > > any needed vfio interfaces, but there are no calls
> > > to the iommu layer...e.g. map_dma() is a noop.
> > >
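
(For concreteness: I'd expect that to be the standard vfio iommu backend
skeleton with a mostly empty implementation behind it.  The sketch below
is untested and the vfio_none_* names are made up, but the ops struct is
the one vfio_register_iommu_driver() takes.)

    #include <linux/module.h>
    #include <linux/vfio.h>

    static const struct vfio_iommu_driver_ops vfio_iommu_none_ops = {
            .name           = "vfio-iommu-none",    /* placeholder name */
            .owner          = THIS_MODULE,
            .open           = vfio_none_open,
            .release        = vfio_none_release,
            .ioctl          = vfio_none_ioctl,      /* MAP_DMA pins/verifies, no iommu calls */
            .attach_group   = vfio_none_attach_group,
            .detach_group   = vfio_none_detach_group,
    };

    /* ...and at module init: vfio_register_iommu_driver(&vfio_iommu_none_ops); */
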
> > > Would like your feedback.
> > 
> > My first thought is that this really detracts from vfio and iommu groups
> > being a secure interface, so somehow this needs to be clearly an
> > insecure mode that requires an opt-in and maybe taints the kernel.  Any
> > notion of unprivileged use needs to be blocked and it should test
> > CAP_COMPROMISE_KERNEL (or whatever it's called now) at critical access
> > points.  We might even have interfaces exported that would allow this to
> > be an out-of-tree driver (worth a check).
> > 
> > I would guess that you would probably want to do all the iommu group
> > setup from the vfio fake-iommu driver.  In other words, that driver both
> > creates the fake groups and provides the dummy iommu backend for vfio.
> > That would be a nice way to compartmentalize this as a
> > vfio-noiommu-special.
> 
> So you mean don't implement any of the iommu driver
> ops at all and keep everything in the vfio layer?
> 
> Would you still have real iommu groups?...i.e. 
> $ readlink /sys/bus/pci/devices/0000:06:0d.0/iommu_group
> ../../../../kernel/iommu_groups/26
> 
> ...and that is created by vfio-noiommu-special?

I'm suggesting (though I haven't checked whether it's possible) that the
iommu driver ops be implemented as part of the vfio iommu backend driver.
The primary motivation for this would be to a) keep a fake iommu groups
interface out of the iommu proper (possibly containing it in an external
driver) and b) modularize it so we don't have fake iommu groups being
created by default.  It would have to populate the iommu groups sysfs
interfaces to be compatible with vfio.
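
Roughly what I have in mind (completely untested, and the vfio_noiommu_*
names are just placeholders) is the backend allocating the group itself
when it claims a device; iommu_group_alloc()/iommu_group_add_device()
already take care of the sysfs pieces vfio looks for:

    #include <linux/device.h>
    #include <linux/err.h>
    #include <linux/iommu.h>

    /* Create a per-device "fake" group from within the vfio noiommu
     * backend instead of from a bus-wide iommu driver.
     */
    static int vfio_noiommu_attach_dev(struct device *dev)
    {
            struct iommu_group *group;
            int ret;

            group = iommu_group_alloc();
            if (IS_ERR(group))
                    return PTR_ERR(group);

            ret = iommu_group_add_device(group, dev);
            iommu_group_put(group);     /* the device now holds the group ref */
            return ret;
    }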

> Right now when the PCI and platform buses are probed,
> the iommu driver add-device callback gets called and
> that is where the per-device group gets created.  Are
> you envisioning registering a callback for the PCI
> bus to do this in vfio-noiommu-special?

Yes.  It's just as easy to walk all the devices as it is to use the
callbacks; IIRC the group code does such a walk when you register anyway.
In fact, this noiommu interface may not want to add all devices; we may
want to be very selective and add only some.
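
Something along these lines (again untested, with made-up names;
vfio_noiommu_dev_allowed() stands in for whatever opt-in policy we settle
on, and vfio_noiommu_attach_dev() is the group-creation sketch above):

    #include <linux/device.h>
    #include <linux/init.h>
    #include <linux/pci.h>

    static int vfio_noiommu_add_dev(struct device *dev, void *data)
    {
            /* Only devices passing an explicit opt-in test get a fake group */
            if (!vfio_noiommu_dev_allowed(dev))
                    return 0;

            if (vfio_noiommu_attach_dev(dev))
                    dev_warn(dev, "vfio-noiommu: failed to create group\n");

            return 0;       /* a non-zero return would stop the bus walk */
    }

    static int __init vfio_noiommu_init(void)
    {
            /* Walk existing PCI devices once at init instead of hooking
             * the bus through a real iommu driver.
             */
            return bus_for_each_dev(&pci_bus_type, NULL, NULL,
                                    vfio_noiommu_add_dev);
    }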

> > Would map/unmap really be no-ops?  Seems like you still want to do page
> > pinning.
> 
> You're right, that was a bad example...most would be no-ops though.
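
Right, the map path at least needs to pin the pages and sanity check the
iova == phys, physically contiguous assumption, even though it never
programs an iommu.  Very roughly (untested sketch, none of these names
exist):

    #include <linux/mm.h>

    static int vfio_noiommu_pin_and_check(unsigned long vaddr,
                                          unsigned long iova, int npages,
                                          struct page **pages)
    {
            int pinned, i;

            /* Pin so the pages can't be migrated out from under DMA */
            pinned = get_user_pages_fast(vaddr, npages, 1, pages);
            if (pinned != npages)
                    goto unwind;

            /* No translation: the caller must pass iova == phys for a
             * physically contiguous buffer, so verify rather than trust.
             */
            for (i = 0; i < npages; i++)
                    if (page_to_pfn(pages[i]) != (iova >> PAGE_SHIFT) + i)
                            goto unwind;

            return 0;

    unwind:
            for (i = 0; i < pinned; i++)
                    put_page(pages[i]);
            return -EINVAL;
    }

The unmap side would then mostly be the matching put_page() loop.
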
> 
> > Also, you're using fsl in the example above, but would such a
> > driver have any platform dependency?
> 
> This wouldn't have to be fsl specific if we thought it was
> potentially generally useful.

Thanks,
Alex

