On Fri, Feb 23, 2018 at 2:09 AM, Peter Xu <pet...@redhat.com> wrote:
> On Fri, Feb 23, 2018 at 06:34:04AM +0000, Jintack Lim wrote:
>> On Fri, Feb 23, 2018 at 1:10 AM Peter Xu <pet...@redhat.com> wrote:
>>
>> > On Fri, Feb 23, 2018 at 12:32:13AM -0500, Jintack Lim wrote:
>> > > Hi Peter,
>> > >
>> > > Hope you had great holidays!
>> > >
>> > > On Thu, Feb 22, 2018 at 10:55 PM, Peter Xu <pet...@redhat.com> wrote:
>> > > > On Tue, Feb 20, 2018 at 11:03:46PM -0500, Jintack Lim wrote:
>> > > >> Hi,
>> > > >>
>> > > >> I'm using vhost with the virtual intel-iommu, and this page[1] shows
>> > > >> the QEMU command line example.
>> > > >>
>> > > >> qemu-system-x86_64 -M q35,accel=kvm,kernel-irqchip=split -m 2G \
>> > > >>                    -device intel-iommu,intremap=on,device-iotlb=on \
>> > > >>                    -device ioh3420,id=pcie.1,chassis=1 \
>> > > >>                    -device virtio-net-pci,bus=pcie.1,netdev=net0,disable-legacy=on,disable-modern=off,iommu_platform=on,ats=on \
>> > > >>                    -netdev tap,id=net0,vhostforce \
>> > > >>                    $IMAGE_PATH
>> > > >>
>> > > >> I wonder what the impact of using the device-iotlb and ats options
>> > > >> is, as they are described as necessary.
>> > > >>
>> > > >> In my understanding, vhost in the kernel only looks at
>> > > >> VIRTIO_F_IOMMU_PLATFORM, and when it is set, vhost uses a
>> > > >> device-iotlb. In addition, vhost and QEMU communicate using vhost_msg,
>> > > >> basically to keep the mappings cached in vhost correct, so I wonder
>> > > >> what the role of ats is in this case.
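
(For reference, the vhost_msg mentioned here is, as far as I can tell, the
IOTLB message format from include/uapi/linux/vhost.h. Roughly, with field
names from my reading of the header, so details may differ per kernel
version:

    struct vhost_iotlb_msg {
            __u64 iova;     /* IOVA the entry covers */
            __u64 size;
            __u64 uaddr;    /* QEMU virtual address backing the IOVA */
            __u8  perm;     /* VHOST_ACCESS_RO/WO/RW */
            __u8  type;     /* VHOST_IOTLB_MISS/UPDATE/INVALIDATE/... */
    };

    struct vhost_msg {
            int type;       /* VHOST_IOTLB_MSG */
            union {
                    struct vhost_iotlb_msg iotlb;
                    __u8 padding[64];
            };
    };

QEMU writes UPDATE/INVALIDATE messages down to the vhost fd, and vhost
sends a MISS message up when it has no cached translation for an IOVA.)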
>> > > >
>> > > > The "ats" as virtio device parameter will add ATS capability to the
>> > > > PCI device.
>> > > >
>> > > > The "device-iotlb" as intel-iommu parameter will enable ATS in the
>> > > > IOMMU device (and also report that in ACPI field).
>> > > >
>> > > > If both parameters are provided, IIUC it means the guest will know the
>> > > > virtio device has a device-iotlb and it'll treat the device specially
>> > > > (e.g., the guest will need to send device-iotlb invalidations).
>> > >
>> > > Oh, I see. I was focusing on how QEMU and vhost work in the host, but
>> > > I think I missed the guest part! Thanks. I see that the Intel IOMMU
>> > > driver has a has_iotlb_device flag for that purpose.
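
(The way I read drivers/iommu/intel-iommu.c, that flag is recomputed when
devices are attached or their ATS state changes; a simplified sketch from
memory, so the exact code will differ:

    /* Mark the domain if any device under it has ATS enabled, so the
     * driver knows whether device-iotlb invalidations are needed. */
    static void domain_update_iotlb(struct dmar_domain *domain)
    {
            struct device_domain_info *info;
            bool has_iotlb_device = false;

            list_for_each_entry(info, &domain->devices, link) {
                    if (info->ats_enabled) {
                            has_iotlb_device = true;
                            break;
                    }
            }
            domain->has_iotlb_device = has_iotlb_device;
    }

so device-iotlb invalidations are only issued when at least one device in
the domain actually has ATS enabled.)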
>> > >
>> > > >
>> > > > We'd better keep these parameters when running virtio devices with
>> > > > vIOMMU.  For the rest of vhost/arm specific questions, I'll leave to
>> > > > others.
>> > >
>> > > It seems like the SMMU is not checking the ATS capability - at least
>> > > the ats_enabled flag - but I may be missing something here as well :)
>> > >
>> > > >
>> > > > PS: Though IIUC the whole ATS thing may not really be necessary for
>> > > > the current VT-d emulation, since even with ATS vhost registers UNMAP
>> > > > IOMMU notifiers (see vhost_iommu_region_add()), and IIUC that means
>> > > > vhost will receive IOTLB invalidations even without ATS support, so
>> > > > it _might_ still work.
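
(For reference, the relevant part of vhost_iommu_region_add() in
hw/virtio/vhost.c looks roughly like the following; trimmed down here, so
treat it as a sketch rather than the exact code:

    /* Register an UNMAP notifier on the vIOMMU memory region, so vhost
     * is told about IOTLB invalidations whether or not ATS is in use. */
    iommu_notifier_init(&iommu->n, vhost_iommu_unmap_notify,
                        IOMMU_NOTIFIER_UNMAP,
                        section->offset_within_region,
                        int128_get64(end));
    memory_region_register_iommu_notifier(section->mr, &iommu->n);

The notifier callback then forwards each invalidation down to the kernel
as a VHOST_IOTLB_INVALIDATE message, IIUC.)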
>> > >
>> > > Right. That's what I thought.
>> > >
>> > > Come to think of it, I'm not sure why we need to flush mappings in the
>> > > IOMMU and in the devices separately in the first place... Any thoughts?
>> >
>> > I don't know ATS much, neither.
>> >
>> > You can have a look at chap 4 of vt-d spec:
>> >
>> >         One approach to scaling IOTLBs is to enable I/O devices to
>> >         participate in the DMA remapping with IOTLBs implemented at
>> >         the devices. The Device-IOTLBs alleviate pressure for IOTLB
>> >         resources in the core logic, and provide opportunities for
>> >         devices to improve performance by pre-fetching address
>> >         translations before issuing DMA requests. This may be useful
>> >         for devices with strict DMA latency requirements (such as
>> >         isochronous devices), and for devices that have large DMA
>> >         working set or multiple active DMA streams.
>> >
>> > So I think it's for performance's sake. For example, a DMA operation
>> > won't need to be translated at all if it's pre-translated, so it has
>> > lower latency.  And also, that offloads some of the translation
>> > process so that the workload can be more distributed.
>> >
>> > With that (caches located both on the IOMMU's side and the device's
>> > side), we need to invalidate all of the caches when needed.
>> >
>>
>> Right. I think my question was not clear. My question was: why doesn't the
>> IOMMU invalidate the device-iotlb along with its own mappings in one go?
>> Then the IOMMU driver wouldn't need to flush the device-iotlb explicitly.
>> Maybe the reason is that ATS and the IOMMU are not always coupled... but I
>> guess it's time for me to get some more background :)
>
> Ah, I see your point.
>
> I don't know the answer.  My wild guess is that the IOMMU is just trying
> to be simple and only provide the most basic functionality, leaving the
> complex stuff to the CPU.  For example, if the IOMMU took over the
> responsibility of delivering device-iotlb invalidations when receiving
> iotlb invalidations, it might need to traverse the device tree sometimes
> (e.g., for domain invalidations) to know which device is under which
> domain, which is really complicated.  Meanwhile it's simpler for the CPU
> to do this, since it's very likely that the OS already keeps a list of
> devices for each domain.
>
> IMHO that follows the *nix philosophy too - Do One Thing And Do It
> Well.  Though again, it's a wild guess and I may be wrong. :)

Cool. Makes sense to me!
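
(FWIW that seems to match the Linux side: the intel-iommu driver keeps a
per-domain device list and walks it to push device-iotlb invalidations
itself. A simplified sketch, names from memory so details may differ:

    static void iommu_flush_dev_iotlb(struct dmar_domain *domain,
                                      u64 addr, unsigned int mask)
    {
            struct device_domain_info *info;

            /* The OS already knows which devices sit in this domain, so
             * it sends a device-IOTLB invalidate to each ATS-enabled one. */
            list_for_each_entry(info, &domain->devices, link) {
                    if (!info->ats_enabled)
                            continue;
                    qi_flush_dev_iotlb(info->iommu,
                                       info->bus << 8 | info->devfn,
                                       info->ats_qdep, addr, mask);
            }
    }

so the hardware never needs to track which devices belong to which domain.)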

>
> CCing Alex, in case he has quick answers.
>
>>
>>
>> > >
>> > > Your reply was really helpful to me. I appreciate it.
>> >
>> > My pleasure.  Thanks,
>> >
>> > >
>> > > Thanks,
>> > > Jintack
>> > >
>> > > > But there can be other differences, like
>> > > > performance, etc.
>> > > >
>> > > >>
>> > > >> A related question: if we use the SMMU emulation[2] on ARM without
>> > > >> those options, does vhost cache mappings as if it had a device-iotlb?
>> > > >> (I guess this is the case.)
>> > > >>
>> > > >> I'm pretty new to QEMU code, so I might be missing something. Can
>> > > >> somebody shed some light on it?
>> > > >>
>> > > >> [1] https://wiki.qemu.org/Features/VT-d
>> > > >> [2] http://lists.nongnu.org/archive/html/qemu-devel/2018-02/msg04736.html
>> > > >>
>> > > >> Thanks,
>> > > >> Jintack
>> > > >>
>> > > >
>> > > > --
>> > > > Peter Xu
>> > > >
>> > >
>> >
>> > --
>> > Peter Xu
>> >
>> >
>
> --
> Peter Xu
>

