On Fri, Nov 11, 2016 at 11:26:12AM +0800, Jason Wang wrote:
>
>
> On 2016-11-11 01:32, Michael S. Tsirkin wrote:
> > On Fri, Nov 04, 2016 at 02:48:20PM +0800, Jason Wang wrote:
> > >
> > > On 2016-11-04 03:49, Michael S. Tsirkin wrote:
> > > > On Thu, Nov 03, 2016 at 05:27:19PM +0800, Jason Wang wrote:
> > > > > > This patch enables Address Translation Service support for
> > > > > > virtio pci devices. This is needed for a guest visible Device
> > > > > > IOTLB implementation and will be required by the vhost device
> > > > > > IOTLB API implementation for intel IOMMU.
> > > > > >
> > > > > > Cc: Michael S. Tsirkin<m...@redhat.com>
> > > > > > Signed-off-by: Jason Wang<jasow...@redhat.com>
> > > > I'd like to understand why you think this is strictly required.
> > > > Won't setting the CM bit in the IOMMU do the trick?
> > > ATS was chosen for performance, since there are many problems with CM:
> > >
> > > - CM was slow (10%-20% slower on real hardware for things like
> > >   netperf) because each transition between non-present and present
> > >   mapping needs an explicit invalidation. It may slow down the
> > >   whole VM.
> > > - Without ATS/Device IOTLB, the IOMMU becomes a bottleneck because
> > >   of contention for IOTLB entries. (What we can do in this case is
> > >   in fact userspace IOTLB snooping; this could be done even without
> > >   CM.)
> > >
> > > It was natural to think of ATS when designing the interface between
> > > the IOMMU and device/remote IOTLBs. Do you see any drawbacks of ATS
> > > here?
> > >
> > > Thanks
> > In fact at this point I'm confused. Any mapping needs to be programmed
> > in the IOMMU. We need to implement this correctly.
> > Once we do, why do we need ATS?
> > I think what you need is the map/unmap notifiers that Aviv is working on.
> > No?
>
> Let me clarify: the device IOTLB API can work without ATS or CM.
> So there are three ways to do it:
>
> 1) Without ATS or CM support, the function could be implemented through:
>    1.1: asking qemu for help if there's an IOTLB miss in vhost
>    1.2: snooping the userspace IOTLB invalidation (present to
>         non-present mapping) and updating the device IOTLB
>
> 2) With CM enabled, the only thing we can add is snooping the
>    non-present to present mapping and updating the device IOTLB. This
>    is not a requirement since we can still get this by asking for
>    qemu's help (1.1).
>
> 3) With ATS enabled, the guest knows about the existence of the device
>    IOTLB, and device IOTLB entries need to be flushed explicitly by
>    the guest. In this case there's no need to snoop the ordinary IOTLB
>    invalidation in 1.2. We just need to snoop the device IOTLB
>    specific invalidation requests from the guest.
>
> All 3 methods above work very well, but let's have a look at the
> performance impact:
>
> - Method 1 (without CM or ATS): the performance is not the best since
>   the guest does not know about the existence of the remote IOTLB,
>   which means device IOTLB entries cannot be flushed on demand. One
>   example is that some IOMMU drivers (e.g. intel) tend to optimize
>   IOTLB invalidations by issuing a global invalidation periodically.
>   We need to flush the device IOTLB too in this case. Thus we can
>   notice some jitter (because of IOTLB misses).
>
> - Method 2 (with CM but without ATS) seems to be the worst case. It
>   has not only all the problems above but also a new one: each
>   transition needs to notify the device explicitly. Even if dpdk uses
>   static mappings, all other devices in the VM use dynamic ones, which
>   slows down the whole system. According to the test, CM is about
>   10%-20% slower on real hardware.
>
> - Method 3 (ATS) can give the best performance: all the problems are
>   gone since the guest can flush device IOTLB entries on demand. It
>   is defined by the spec, was designed to solve issues just like the
>   ones we meet here, and is supported by modern IOMMUs.
>
> And what's even better, implementing ATS turns out to take less than
> 100 lines of code. And it is much easier to enable on other IOMMUs
> (the AMD IOMMU only needs 20 lines of code). All other ways (I started
> on, and have code for, method 1 for the intel IOMMU) need lots of work
> specific to each kind of IOMMU.

Method 1 is basically what Aviv implemented, except you don't need map
notifiers, only unmap.

>
> Considering so many advantages from adding so few lines of code, I
> don't see why we don't need ATS (for the IOMMUs that support it).
>
> Thanks

I am concerned that not all IOMMUs and guests support ATS.

> > > >
> > > > Also, could you remind me pls - can guests just disable ATS?
> > > >
> > > > What happens then?
> > > >
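
For readers following the thread, the three device IOTLB maintenance
methods discussed above can be sketched as a toy model. This is only an
illustration under stated assumptions: `DeviceIOTLB`, `ToyIOMMU`, the
mode names and the 4 KiB page size are all hypothetical and are not QEMU
or vhost APIs. The model follows the behaviour as described in this
thread, not the behaviour of any real IOMMU driver.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

PAGE_SHIFT = 12  # assumption: 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


@dataclass
class DeviceIOTLB:
    """Toy device-side translation cache (the role vhost plays here)."""
    translate: Callable[[int], Optional[int]]  # hypothetical "ask qemu" hook
    entries: Dict[int, int] = field(default_factory=dict)
    misses: int = 0

    def lookup(self, iova: int) -> Optional[int]:
        page = iova >> PAGE_SHIFT
        if page not in self.entries:
            self.misses += 1                 # 1.1: miss, ask qemu for help
            target = self.translate(page)
            if target is None:
                return None                  # no mapping for this page
            self.entries[page] = target
        return (self.entries[page] << PAGE_SHIFT) | (iova & PAGE_MASK)


class ToyIOMMU:
    """Toy emulated IOMMU; mode is "snoop" (method 1), "cm" (2) or "ats" (3)."""

    def __init__(self, mode: str) -> None:
        self.mode = mode
        self.page_table: Dict[int, int] = {}
        self.dev = DeviceIOTLB(translate=self.page_table.get)
        self.notifications = 0  # messages pushed from the IOMMU to the device

    def map(self, iova: int, gpa: int) -> None:
        self.page_table[iova >> PAGE_SHIFT] = gpa >> PAGE_SHIFT
        if self.mode == "cm":
            # Method 2: every non-present -> present transition is reported.
            self.dev.entries[iova >> PAGE_SHIFT] = gpa >> PAGE_SHIFT
            self.notifications += 1

    def unmap(self, iova: int) -> None:
        self.page_table.pop(iova >> PAGE_SHIFT, None)

    def iotlb_invalidate(self, iova: Optional[int] = None) -> None:
        """Ordinary IOTLB invalidation; iova=None models a global one."""
        if self.mode == "ats":
            return  # method 3: only explicit device IOTLB requests count
        self.notifications += 1              # 1.2: snooped and forwarded
        if iova is None:
            self.dev.entries.clear()
        else:
            self.dev.entries.pop(iova >> PAGE_SHIFT, None)

    def device_iotlb_invalidate(self, iova: int) -> None:
        """Device IOTLB specific invalidation from an ATS-aware guest."""
        if self.mode == "ats":
            self.dev.entries.pop(iova >> PAGE_SHIFT, None)
            self.notifications += 1


def misses_after_global_flush(mode: str) -> int:
    iommu = ToyIOMMU(mode)
    iommu.map(0x1000, 0x8000)
    iommu.dev.lookup(0x1234)   # first access (a miss unless CM pre-filled it)
    iommu.iotlb_invalidate()   # periodic global flush, mapping unchanged
    iommu.dev.lookup(0x1234)
    return iommu.dev.misses
```

In this toy model, a periodic global invalidation costs the snooping
variants a device IOTLB refill even though the mapping never changed,
while the "ats" variant keeps its entry until the guest asks for it to
be dropped. The "cm" variant additionally pays one notification per
`map()` call, which is the per-transition cost mentioned for method 2.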