On Fri, Feb 23, 2018 at 2:09 AM, Peter Xu <pet...@redhat.com> wrote:
> On Fri, Feb 23, 2018 at 06:34:04AM +0000, Jintack Lim wrote:
>> On Fri, Feb 23, 2018 at 1:10 AM Peter Xu <pet...@redhat.com> wrote:
>>
>> > On Fri, Feb 23, 2018 at 12:32:13AM -0500, Jintack Lim wrote:
>> > > Hi Peter,
>> > >
>> > > Hope you had great holidays!
>> > >
>> > > On Thu, Feb 22, 2018 at 10:55 PM, Peter Xu <pet...@redhat.com> wrote:
>> > > > On Tue, Feb 20, 2018 at 11:03:46PM -0500, Jintack Lim wrote:
>> > > >> Hi,
>> > > >>
>> > > >> I'm using vhost with the virtual intel-iommu, and this page[1] shows
>> > > >> the QEMU command line example.
>> > > >>
>> > > >> qemu-system-x86_64 -M q35,accel=kvm,kernel-irqchip=split -m 2G \
>> > > >>   -device intel-iommu,intremap=on,device-iotlb=on \
>> > > >>   -device ioh3420,id=pcie.1,chassis=1 \
>> > > >>   -device virtio-net-pci,bus=pcie.1,netdev=net0,disable-legacy=on,disable-modern=off,iommu_platform=on,ats=on \
>> > > >>   -netdev tap,id=net0,vhostforce \
>> > > >>   $IMAGE_PATH
>> > > >>
>> > > >> I wonder what the impact of using the device-iotlb and ats options
>> > > >> is, as they are described as necessary.
>> > > >>
>> > > >> In my understanding, vhost in the kernel only looks at
>> > > >> VIRTIO_F_IOMMU_PLATFORM, and when it is set, vhost uses a
>> > > >> device-iotlb. In addition, vhost and QEMU communicate using vhost_msg,
>> > > >> basically to cache mappings correctly in vhost, so I wonder what the
>> > > >> role of ats is in this case.
>> > > >
>> > > > The "ats" virtio device parameter will add the ATS capability to the
>> > > > PCI device.
>> > > >
>> > > > The "device-iotlb" intel-iommu parameter will enable ATS in the
>> > > > IOMMU device (and also report that in the ACPI field).
>> > > >
>> > > > If both parameters are provided, IIUC it means the guest will know the
>> > > > virtio device has a device-iotlb and will treat the device specially
>> > > > (e.g., the guest will need to send device-iotlb invalidations).
>> > >
>> > > Oh, I see. I was focusing on how QEMU and vhost work in the host, but
>> > > I think I missed the guest part! Thanks. I see that the Intel IOMMU
>> > > driver has a has_iotlb_device flag for that purpose.
>> > >
>> > > > We'd better keep these parameters when running virtio devices with a
>> > > > vIOMMU. For the rest of the vhost/ARM-specific questions, I'll leave
>> > > > those to others.
>> > >
>> > > It seems like the SMMU is not checking the ATS capability - at least
>> > > the ats_enabled flag - but I may be missing something here as well :)
>> > >
>> > > > PS: Though IIUC the whole ATS thing may not really be necessary for
>> > > > the current VT-d emulation, since even with ATS vhost registers UNMAP
>> > > > IOMMU notifiers (see vhost_iommu_region_add()), and IIUC that means
>> > > > vhost will receive IOTLB invalidations even without ATS support, and
>> > > > it _might_ still work.
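
Side note, since I went and looked this up: the registration Peter
mentions lives in QEMU's hw/virtio/vhost.c. Trimmed down to the
relevant calls - so this is a sketch of the idea, not the verbatim
code - it looks roughly like this:

    /* Called when a vIOMMU memory region shows up in the device's
     * address space; vhost asks to be notified on UNMAP (i.e.
     * invalidation) events covering the whole region. */
    static void vhost_iommu_region_add(MemoryListener *listener,
                                       MemoryRegionSection *section)
    {
        ...
        iommu_notifier_init(&iommu->n, vhost_iommu_unmap_notify,
                            IOMMU_NOTIFIER_UNMAP,
                            section->offset_within_region,
                            int128_get64(end));
        memory_region_register_iommu_notifier(section->mr, &iommu->n);
        ...
    }

The unmap handler then forwards each invalidation to the vhost
backend, which for the kernel backend becomes a VHOST_IOTLB_INVALIDATE
message; that, together with the VHOST_IOTLB_MISS/UPDATE pair used to
fill the cache, is the vhost_msg protocol I was asking about. The
message format is the uapi struct (from include/uapi/linux/vhost.h in
recent kernels):

    struct vhost_iotlb_msg {
            __u64 iova;
            __u64 size;
            __u64 uaddr;
    #define VHOST_ACCESS_RO      0x1
    #define VHOST_ACCESS_WO      0x2
    #define VHOST_ACCESS_RW      0x3
            __u8 perm;
    #define VHOST_IOTLB_MISS           1
    #define VHOST_IOTLB_UPDATE         2
    #define VHOST_IOTLB_INVALIDATE     3
    #define VHOST_IOTLB_ACCESS_FAIL    4
            __u8 type;
    };

So vhost's cached mappings get invalidated through this path whether
or not the guest treats the device as having a device-iotlb, which
matches Peter's PS above.
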
>> > > Right. That's what I thought.
>> > >
>> > > Come to think of it, I'm not sure why we need to flush mappings in the
>> > > IOMMU and in devices separately in the first place... Any thoughts?
>> >
>> > I don't know ATS much, either.
>> >
>> > You can have a look at chapter 4 of the VT-d spec:
>> >
>> >     One approach to scaling IOTLBs is to enable I/O devices to
>> >     participate in the DMA remapping with IOTLBs implemented at
>> >     the devices. The Device-IOTLBs alleviate pressure for IOTLB
>> >     resources in the core logic, and provide opportunities for
>> >     devices to improve performance by pre-fetching address
>> >     translations before issuing DMA requests. This may be useful
>> >     for devices with strict DMA latency requirements (such as
>> >     isochronous devices), and for devices that have large DMA
>> >     working set or multiple active DMA streams.
>> >
>> > So I think it's for performance's sake. For example, a DMA operation
>> > won't need to be translated at all if it's pre-translated, so it can
>> > have lower latency. It also offloads some of the translation work, so
>> > that the load can be more distributed.
>> >
>> > With that (caches located on both the IOMMU's side and the device's
>> > side), we need to invalidate all of the caches when needed.
>> >
>>
>> Right. I think my question was not clear. My question was why the IOMMU
>> doesn't invalidate the device-iotlb along with its own mappings in one
>> go; then the IOMMU driver wouldn't need to flush the device-iotlb
>> explicitly. Maybe the reason is that ATS and the IOMMU are not always
>> coupled... but I guess it's time for me to get some more background :)
>
> Ah, I see your point.
>
> I don't know the answer. My wild guess is that the IOMMU is just trying
> to be simple and only provide the most basic functionality, leaving
> complex stuff to the CPU. For example, if the IOMMU took over the
> ownership of delivering device-iotlb invalidations when receiving IOTLB
> invalidations, it would sometimes need to traverse the device tree
> (e.g., for domain invalidations) to know which device is under which
> domain, which is really complicated. It's simpler for the CPU to do
> this, since it's very likely that the OS already keeps a list of
> devices for each domain.
>
> IMHO that follows the *nix philosophy too - Do One Thing And Do It
> Well. Though again, it's a wild guess and I may be wrong. :)
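
FWIW, that guess seems to match what the Linux intel-iommu driver
actually does: on an IOTLB flush it walks the domain's device list and
queues one device-IOTLB invalidation per ATS-enabled device.
Simplified from drivers/iommu/intel-iommu.c - locking dropped, so
again just a sketch:

    static void iommu_flush_dev_iotlb(struct dmar_domain *domain,
                                      u64 addr, unsigned mask)
    {
            struct device_domain_info *info;

            /* The OS side already keeps the per-domain device list... */
            list_for_each_entry(info, &domain->devices, link) {
                    u16 sid, qdep;

                    if (!info->ats_enabled)
                            continue;

                    sid = info->bus << 8 | info->devfn;
                    qdep = info->ats_qdep;
                    /* ...so the CPU, not the IOMMU, fans the
                     * invalidation out to each ATS-capable device. */
                    qi_flush_dev_iotlb(info->iommu, sid, qdep, addr, mask);
            }
    }

So the "OS keeps a list of devices for a domain" part is exactly how it
works in practice.
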
Cool. Makes sense to me!

> CCing Alex, in case he has quick answers.
>
>> > > Your reply was really helpful to me. I appreciate it.
>> >
>> > My pleasure. Thanks,
>> >
>> > > Thanks,
>> > > Jintack
>> > >
>> > > > But there can be other differences, like
>> > > > performance, etc.
>> > > >
>> > > >> A related question is that if we use SMMU emulation[2] on ARM
>> > > >> without those options, does vhost cache mappings as if it has a
>> > > >> device-iotlb? (I guess this is the case.)
>> > > >>
>> > > >> I'm pretty new to QEMU code, so I might be missing something.
>> > > >> Can somebody shed some light on it?
>> > > >>
>> > > >> [1] https://wiki.qemu.org/Features/VT-d
>> > > >> [2] http://lists.nongnu.org/archive/html/qemu-devel/2018-02/msg04736.html
>> > > >>
>> > > >> Thanks,
>> > > >> Jintack
>> > > >
>> > > > --
>> > > > Peter Xu
>> >
>> > --
>> > Peter Xu
>
> --
> Peter Xu