Re: [PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
On Fri, Apr 08, 2022 at 10:07:50AM -0600, Alex Williamson wrote:
> On Fri, 8 Apr 2022 10:59:22 -0500 Bjorn Helgaas wrote:
> > On Fri, Apr 08, 2022 at 05:37:16PM +0200, Joerg Roedel wrote:
> > > On Fri, Apr 08, 2022 at 11:17:47AM -0300, Jason Gunthorpe wrote:
> > > > You might consider using a linear tree instead of the topic branches,
> > > > topics are tricky and I'm not sure it helps a small subsystem so much.
> > > > Conflicts between topics are a PITA for everyone, and it makes
> > > > handling conflicts with rc much harder than it needs to be.
> > >
> > > I like the concept of a branch per driver, because with that I can just
> > > exclude that branch from my next-merge when there are issues with it.
> > > Conflicts between branches happen too, but they are quite manageable
> > > when the branches have the same base.
> >
> > FWIW, I use the same topic branch approach for PCI. I like the
> > ability to squash in fixes or drop things without having to clutter
> > the history with trivial commits and reverts. I haven't found
> > conflicts to be a problem.
>
> Same. I think I've generally modeled my branch handling after Bjorn
> and Joerg, I don't always use topic branches, but will for larger
> contributions and I don't generally find conflicts to be a problem.
> I'm always open to adopting best practices though. Thanks,

I don't know about best practices, but I see most maintainers fall
somewhere on a continuum between how Andrew Morton works and how David
Miller/Linus work.

Andrew's model is to try to send patches that are perfect, and he
manipulates his queue continually. It is never quite clear what will get
merged until Linus actually merges it. The David/Linus model is that git
is immutable and they only move forward: never rebase, never manipulate
an applied patch.

Andrew has significantly reined in how much he manipulates his queue -
mostly due to pressure from Linus. Some of the remarks on this topic over
the last year are pretty informative. So I would say changing patches
once applied, or any rebasing, is now seen as not best practice. (Indeed,
if Linus catches it and a mistake was made, you are likely to get a sharp
email.)

The reason I made the note is that, at least in my flow, I would not
trade two weeks of accepting patches for topics. I'll probably have 20-30
patches merged already before rc3 comes out. I never have problems
merging rc's, because when an rc conflicts with the next I have only one
branch and can just merge the rc and resolve the conflict, or merge the
rc before applying a patch that would create a conflict in the first
place. Linus has given some guidance on when/how he prefers to see those
merges done.

Though I tend to advocate for a philosophy more like DaveM's - that the
maintainer's role is to hurry up and accept good patches, emphasizing
flow as a way to build involvement and community. That is not necessarily
an entirely appropriate approach in some of the more critical subsystems
like mm/pci: if they are broken in a way that impacts a large number of
people at rc1, it can cause a lot of impact. For instance, our QA team
always panics if rc1 doesn't work on our test environments.

Cheers,
Jason
Re: [PATCH] drm/tegra: Stop using iommu_present()
On 2022-04-07 18:51, Dmitry Osipenko wrote:
> On 4/6/22 21:06, Robin Murphy wrote:
> > On 2022-04-06 15:32, Dmitry Osipenko wrote:
> > > On 4/5/22 17:19, Robin Murphy wrote:
> > > > Remove the pointless check. host1x_drm_wants_iommu() cannot return
> > > > true unless an IOMMU exists for the host1x platform device, which at
> > > > the moment means the iommu_present() test could never fail.
> > > >
> > > > Signed-off-by: Robin Murphy
> > > > ---
> > > >  drivers/gpu/drm/tegra/drm.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
> > > > index 9464f522e257..bc4321561400 100644
> > > > --- a/drivers/gpu/drm/tegra/drm.c
> > > > +++ b/drivers/gpu/drm/tegra/drm.c
> > > > @@ -1149,7 +1149,7 @@ static int host1x_drm_probe(struct host1x_device *dev)
> > > >             goto put;
> > > >     }
> > > >
> > > > -   if (host1x_drm_wants_iommu(dev) && iommu_present(&platform_bus_type)) {
> > > > +   if (host1x_drm_wants_iommu(dev)) {
> > > >             tegra->domain = iommu_domain_alloc(&platform_bus_type);
> > > >             if (!tegra->domain) {
> > > >                     err = -ENOMEM;
> > >
> > > host1x_drm_wants_iommu() returns true if there is no IOMMU for the
> > > host1x platform device of Tegra20/30 SoCs.
> >
> > Ah, apparently this is another example of what happens when I write
> > patches late on a Friday night... So on second look, what we want to
> > ascertain here is whether dev has an IOMMU, but only if the host1x
> > parent is not addressing-limited, either because it can also use the
> > IOMMU, or because all possible addresses are small enough anyway, right?
>
> Yes
>
> > Are we specifically looking for the host1x having a DMA-API-managed
> > domain, or can that also end up using the tegra->domain or another
> > unmanaged domain too?
>
> We have host1x DMA that could have:
>
> 1. No IOMMU domain, depending on kernel/DT config
> 2. Managed domain, on newer SoCs
> 3. Unmanaged domain, on older SoCs
>
> We have Tegra DRM devices which can:
>
> 1. Be attached to a shared unmanaged tegra->domain, on older SoCs
> 2. Have own managed domains, on newer SoCs
>
> > I can't quite figure out from the comments whether it's physical
> > addresses, IOVAs, or both that we're concerned with here.
>
> Tegra DRM allocates buffers and submits jobs to h/w using host1x's
> channel DMA. DRM framebuffers' addresses are inserted into host1x
> command buffers by the kernel driver, and addresses beyond the 32-bit
> space need to be treated specially; we don't support such addresses in
> upstream. The IOMMU AS is limited to 32 bits on Tegra in the upstream
> kernel for pre-T186 SoCs, which hides 64-bit addresses from host1x.
> Post-T186 SoCs have extra features that allow the kernel driver not to
> bother about addresses. For newer ARM64 SoCs there is an assumption in
> the Tegra drivers that an IOMMU is always present, to simplify things.

That summary helps a lot, thank you! I was particularly worried about the
case where the host1x has a passthrough domain, which we'll assume is a
DMA domain and leave in place, but if all the SoCs with the 32-bit gather
limitation are also the ones with tegra-smmu, which doesn't support
default domains anyway, then it sounds like that's a non-issue.

I'll give this a bit more thought to make sure I really get it right, and
send a v2 next week.

Thanks,
Robin.
Re: [PATCH v2 0/4] Make the iommu driver no-snoop block feature consistent
On 2022-04-08 14:35, Jason Gunthorpe wrote: On Fri, Apr 08, 2022 at 02:11:10PM +0100, Robin Murphy wrote: However, this creates an oddball situation where the vfio_device and it's struct device could become unplugged from the system while the domain that the struct device spawned continues to exist and remains attached to other devices in the same group. ie the iommu driver has to be careful not to retain the struct device input.. Oh, I rather assumed that VFIO might automatically tear down the container/domain when the last real user disappears. It does, that isn't quite what I mean.. Lets say a simple case with two groups and two devices. Open a VFIO container FD We open group A and SET_CONTAINER it. This results in an domain_A = iommu_domain_alloc(device_A) iommu_attach_group(domain_A, device_A->group) We open group B and SET_CONTAINER it. Using the sharing logic we end up doing iommu_attach_group(domain_A, device_B->group) Now we close group A FD, detatch device_A->group from domain_A and the driver core hot-unplugs device A completely from the system. However, domain_A remains in the system used by group B's open FD. It is a bit funny at least.. I think it is just something to document and be aware of for iommu driver writers that they probably shouldn't try to store the allocation device in their domain struct. IHMO the only purpose of the allocation device is to crystalize the configuration of the iommu_domain at allocation time. Oh, for sure. When I implement the API switch, I can certainly try to document it as clearly as possible that the device argument is only for resolving the correct IOMMU ops and target instance, and the resulting domain is still not in any way tied to that specific device. I hadn't thought about how it might look to future developers who aren't already familiar with all the history here, so thanks for the nudge! as long as we take care not to release DMA ownership until that point also. As you say, it just looks a bit funny. The DMA ownership should be OK as we take ownership on each group FD open I suppose that is inevitable to have sharing of domains across devices, so the iommu drivers will have to accommodate this. I think domain lifecycle management is already entirely up to the users and not something that IOMMU drivers need to worry about. Drivers should only need to look at per-device data in attach/detach (and, once I've finished, alloc) from the device argument which can be assumed to be valid at that point. Otherwise, all the relevant internal data for domain ops should belong to the domain already. Making attach/detach take a struct device would be nice - but I would expect the attach/detatch to use a strictly paired struct device and I don't think this trick of selecting an arbitary vfio_device will achieve that. So, I suppose VFIO would want to attach/detatch on every vfio_device individually and it would iterate over the group instead of doing a list_first_entry() like above. This would not be hard to do in VFIO. It feels like we've already beaten that discussion to death in other threads; regardless of what exact argument the iommu_attach/detach operations end up taking, they have to operate on the whole (explicit or implicit) iommu_group at once, because doing anything else would defeat the point of isolation groups, and be impossible for alias groups. Not sure what the iommu layer would have to do to accommodate this.. 
If it's significantly easier for VFIO to just run through a whole list of
devices and attach each one, without having to keep track of whether they
might share an iommu_group which has already been attached, then we can
probably relax the API a little such that attaching to a domain which is
already the current domain becomes a no-op instead of returning -EBUSY.
But I'd rather not create an expectation that anyone *has* to do that:
for any other callers that would be forcing *more* iommu_group
implementation details onto them, when we all want less.

Cheers,
Robin.
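For illustration, the per-vfio_device attach loop being debated would look
roughly like the sketch below. This is not the current VFIO code: it assumes
a hypothetical per-device attach path with the relaxed "already attached is a
no-op" semantic described above, and borrows the device_list/group_next field
names from Jason's snippet elsewhere in this thread.

        struct vfio_device *vdev;
        int ret = 0;

        mutex_lock(&group->device_lock);
        list_for_each_entry(vdev, &group->device_list, group_next) {
                /* Under the relaxed semantic, attaching a device whose
                 * group is already in this domain simply succeeds. */
                ret = iommu_attach_device(domain, vdev->dev);
                if (ret)
                        break;  /* caller would detach what was attached so far */
        }
        mutex_unlock(&group->device_lock);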
Re: [PATCH] RDMA/usnic: Refactor usnic_uiom_alloc_pd()
On Tue, Apr 05, 2022 at 03:35:59PM +0100, Robin Murphy wrote: > Rather than hard-coding pci_bus_type, pass the PF device through to > usnic_uiom_alloc_pd() and retrieve its bus there. This prepares for > iommu_domain_alloc() changing to take a device rather than a bus_type. > > Signed-off-by: Robin Murphy > Reported-by: kernel test robot > --- > drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 3 +-- > drivers/infiniband/hw/usnic/usnic_uiom.c | 5 ++--- > drivers/infiniband/hw/usnic/usnic_uiom.h | 2 +- > 3 files changed, 4 insertions(+), 6 deletions(-) Applied to for-next, thanks Jason ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] RDMA/usnic: Stop using iommu_present()
On Tue, Apr 05, 2022 at 01:19:59PM +0100, Robin Murphy wrote: > Even if an IOMMU might be present for some PCI segment in the system, > that doesn't necessarily mean it provides translation for the device(s) > we care about. Replace iommu_present() with a more appropriate check at > probe time, and garbage-collect the resulting empty init function. > > Signed-off-by: Robin Murphy > --- > drivers/infiniband/hw/usnic/usnic_ib_main.c | 11 +-- > drivers/infiniband/hw/usnic/usnic_uiom.c| 10 -- > drivers/infiniband/hw/usnic/usnic_uiom.h| 1 - > 3 files changed, 5 insertions(+), 17 deletions(-) Applied to for-next, thanks Jason ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [git pull] IOMMU Fixes for Linux v5.18-rc1
The pull request you sent on Fri, 8 Apr 2022 13:52:06 +0200: > git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git > tags/iommu-fix-v5.18-rc1 has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/02994fd2da76b99d3f6777f5b3bdb5d2823a0fed Thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/prtracker.html ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
On Fri, 8 Apr 2022 10:59:22 -0500 Bjorn Helgaas wrote: > On Fri, Apr 08, 2022 at 05:37:16PM +0200, Joerg Roedel wrote: > > On Fri, Apr 08, 2022 at 11:17:47AM -0300, Jason Gunthorpe wrote: > > > You might consider using a linear tree instead of the topic branches, > > > topics are tricky and I'm not sure it helps a small subsystem so much. > > > Conflicts between topics are a PITA for everyone, and it makes > > > handling conflicts with rc much harder than it needs to be. > > > > I like the concept of a branch per driver, because with that I can just > > exclude that branch from my next-merge when there are issues with it. > > Conflicts between branches happen too, but they are quite manageable > > when the branches have the same base. > > FWIW, I use the same topic branch approach for PCI. I like the > ability to squash in fixes or drop things without having to clutter > the history with trivial commits and reverts. I haven't found > conflicts to be a problem. Same. I think I've generally modeled my branch handling after Bjorn and Joerg, I don't always use topic branches, but will for larger contributions and I don't generally find conflicts to be a problem. I'm always open to adopting best practices though. Thanks, Alex ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
On Fri, Apr 08, 2022 at 05:37:16PM +0200, Joerg Roedel wrote: > On Fri, Apr 08, 2022 at 11:17:47AM -0300, Jason Gunthorpe wrote: > > You might consider using a linear tree instead of the topic branches, > > topics are tricky and I'm not sure it helps a small subsystem so much. > > Conflicts between topics are a PITA for everyone, and it makes > > handling conflicts with rc much harder than it needs to be. > > I like the concept of a branch per driver, because with that I can just > exclude that branch from my next-merge when there are issues with it. > Conflicts between branches happen too, but they are quite manageable > when the branches have the same base. FWIW, I use the same topic branch approach for PCI. I like the ability to squash in fixes or drop things without having to clutter the history with trivial commits and reverts. I haven't found conflicts to be a problem. Bjorn ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 4/4] vfio: Require that devices support DMA cache coherence
On Thu, 7 Apr 2022 12:23:47 -0300 Jason Gunthorpe wrote: > IOMMU_CACHE means that normal DMAs do not require any additional coherency > mechanism and is the basic uAPI that VFIO exposes to userspace. For > instance VFIO applications like DPDK will not work if additional coherency > operations are required. > > Therefore check IOMMU_CAP_CACHE_COHERENCY like vdpa & usnic do before > allowing an IOMMU backed VFIO device to be created. > > Signed-off-by: Jason Gunthorpe > --- > drivers/vfio/vfio.c | 7 +++ > 1 file changed, 7 insertions(+) > > diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c > index a4555014bd1e72..9edad767cfdad3 100644 > --- a/drivers/vfio/vfio.c > +++ b/drivers/vfio/vfio.c > @@ -815,6 +815,13 @@ static int __vfio_register_dev(struct vfio_device > *device, > > int vfio_register_group_dev(struct vfio_device *device) > { > + /* > + * VFIO always sets IOMMU_CACHE because we offer no way for userspace to > + * restore cache coherency. > + */ > + if (!iommu_capable(device->dev->bus, IOMMU_CAP_CACHE_COHERENCY)) > + return -EINVAL; > + > return __vfio_register_dev(device, > vfio_group_find_or_alloc(device->dev)); > } Acked-by: Alex Williamson ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 2/4] vfio: Move the Intel no-snoop control off of IOMMU_CACHE
On Thu, 7 Apr 2022 12:23:45 -0300 Jason Gunthorpe wrote: > IOMMU_CACHE means "normal DMA to this iommu_domain's IOVA should be cache > coherent" and is used by the DMA API. The definition allows for special > non-coherent DMA to exist - ie processing of the no-snoop flag in PCIe > TLPs - so long as this behavior is opt-in by the device driver. > > The flag is mainly used by the DMA API to synchronize the IOMMU setting > with the expected cache behavior of the DMA master. eg based on > dev_is_dma_coherent() in some case. > > For Intel IOMMU IOMMU_CACHE was redefined to mean 'force all DMA to be > cache coherent' which has the practical effect of causing the IOMMU to > ignore the no-snoop bit in a PCIe TLP. > > x86 platforms are always IOMMU_CACHE, so Intel should ignore this flag. > > Instead use the new domain op enforce_cache_coherency() which causes every > IOPTE created in the domain to have the no-snoop blocking behavior. > > Reconfigure VFIO to always use IOMMU_CACHE and call > enforce_cache_coherency() to operate the special Intel behavior. > > Remove the IOMMU_CACHE test from Intel IOMMU. > > Ultimately VFIO plumbs the result of enforce_cache_coherency() back into > the x86 platform code through kvm_arch_register_noncoherent_dma() which > controls if the WBINVD instruction is available in the guest. No other > arch implements kvm_arch_register_noncoherent_dma(). I think this last sentence is alluding to it, but I wish the user visible change to VFIO_DMA_CC_IOMMU on non-x86 were more explicit. Perhaps for the last sentence: No other archs implement kvm_arch_register_noncoherent_dma() nor are there any other known consumers of VFIO_DMA_CC_IOMMU that might be affected by the user visible result change on non-x86 archs. Otherwise, Acked-by: Alex Williamson > > Signed-off-by: Jason Gunthorpe > --- > drivers/iommu/intel/iommu.c | 7 ++- > drivers/vfio/vfio_iommu_type1.c | 30 +++--- > include/linux/intel-iommu.h | 1 - > 3 files changed, 21 insertions(+), 17 deletions(-) > > diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c > index f08611a6cc4799..8f3674e997df06 100644 > --- a/drivers/iommu/intel/iommu.c > +++ b/drivers/iommu/intel/iommu.c > @@ -641,7 +641,6 @@ static unsigned long domain_super_pgsize_bitmap(struct > dmar_domain *domain) > static void domain_update_iommu_cap(struct dmar_domain *domain) > { > domain_update_iommu_coherency(domain); > - domain->iommu_snooping = domain_update_iommu_snooping(NULL); > domain->iommu_superpage = domain_update_iommu_superpage(domain, NULL); > > /* > @@ -4283,7 +4282,6 @@ static int md_domain_init(struct dmar_domain *domain, > int guest_width) > domain->agaw = width_to_agaw(adjust_width); > > domain->iommu_coherency = false; > - domain->iommu_snooping = false; > domain->iommu_superpage = 0; > domain->max_addr = 0; > > @@ -4422,8 +4420,7 @@ static int intel_iommu_map(struct iommu_domain *domain, > prot |= DMA_PTE_READ; > if (iommu_prot & IOMMU_WRITE) > prot |= DMA_PTE_WRITE; > - if (((iommu_prot & IOMMU_CACHE) && dmar_domain->iommu_snooping) || > - dmar_domain->enforce_no_snoop) > + if (dmar_domain->enforce_no_snoop) > prot |= DMA_PTE_SNP; > > max_addr = iova + size; > @@ -4550,7 +4547,7 @@ static bool intel_iommu_enforce_cache_coherency(struct > iommu_domain *domain) > { > struct dmar_domain *dmar_domain = to_dmar_domain(domain); > > - if (!dmar_domain->iommu_snooping) > + if (!domain_update_iommu_snooping(NULL)) > return false; > dmar_domain->enforce_no_snoop = true; > return true; > diff --git a/drivers/vfio/vfio_iommu_type1.c 
b/drivers/vfio/vfio_iommu_type1.c > index 9394aa9444c10c..c13b9290e35759 100644 > --- a/drivers/vfio/vfio_iommu_type1.c > +++ b/drivers/vfio/vfio_iommu_type1.c > @@ -84,8 +84,8 @@ struct vfio_domain { > struct iommu_domain *domain; > struct list_headnext; > struct list_headgroup_list; > - int prot; /* IOMMU_CACHE */ > - boolfgsp; /* Fine-grained super pages */ > + boolfgsp : 1; /* Fine-grained super pages */ > + boolenforce_cache_coherency : 1; > }; > > struct vfio_dma { > @@ -1461,7 +1461,7 @@ static int vfio_iommu_map(struct vfio_iommu *iommu, > dma_addr_t iova, > > list_for_each_entry(d, &iommu->domain_list, next) { > ret = iommu_map(d->domain, iova, (phys_addr_t)pfn << PAGE_SHIFT, > - npage << PAGE_SHIFT, prot | d->prot); > + npage << PAGE_SHIFT, prot | IOMMU_CACHE); > if (ret) > goto unwind; > > @@ -1771,7 +1771,7 @@ static int vfio_iommu_replay(struct vfio_iommu *iommu, > } > >
Re: [PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
On Fri, Apr 08, 2022 at 11:17:47AM -0300, Jason Gunthorpe wrote:
> You might consider using a linear tree instead of the topic branches,
> topics are tricky and I'm not sure it helps a small subsystem so much.
> Conflicts between topics are a PITA for everyone, and it makes
> handling conflicts with rc much harder than it needs to be.

I like the concept of a branch per driver, because with that I can just
exclude that branch from my next-merge when there are issues with it.
Conflicts between branches happen too, but they are quite manageable when
the branches have the same base.

Overall I am thinking of reorganizing the IOMMU tree, but it will likely
not end up being a single-branch tree, although the number of patches per
cycle _could_ just be carried in a single branch.

> At least I haven't felt a need for topics while running larger trees,
> and would find it stressful to try and squeeze the entire patch flow
> into only 3 weeks out of the 7 week cycle.

Yeah, so it is 4 weeks in a 9-week cycle :) The merge window is 2 weeks
and not a lot happens. The 2 weeks after are for stabilization and I
usually only pick up fixes. Then come the 4 weeks where new code gets
into the tree. In the last week everything gets tested in linux-next to
be ready for the merge window. I will pick up fixes in that week, of
course.

> In any event, I'd like this on a branch so Alex can pull it too, I
> guess it means Alex has to merge rc3 to VFIO as well?

Sure, I can put these patches in a separate branch for Alex to pull into
the VFIO tree.

Regards,
Joerg
Re: [PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
On Fri, Apr 08, 2022 at 04:00:31PM +0200, Joerg Roedel wrote: > On Fri, Apr 08, 2022 at 09:23:52AM -0300, Jason Gunthorpe wrote: > > Why rc3? It has been 4 weeks now with no futher comments. > > Because I start applying new code to branches based on -rc3. In the past > I used different -rc's for the topic branches (usually the latest -rc > available when I started applying to that branch), but that caused silly > merge conflicts from time to time. So I am now basing every topic branch > on the same -rc, which is usually -rc3. Rationale is that by -rc3 time > the kernel should have reasonably stabilized after the merge window. You might consider using a linear tree instead of the topic branches, topics are tricky and I'm not sure it helps a small subsystem so much. Conflicts between topics are a PITA for everyone, and it makes handling conflicts with rc much harder than it needs to be. At least I haven't felt a need for topics while running larger trees, and would find it stressful to try and squeeze the entire patch flow into only 3 weeks out of the 7 week cycle. In any event, I'd like this on a branch so Alex can pull it too, I guess it means Alex has to merge rc3 to VFIO as well? Thanks for explaining Jason ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
On Fri, Apr 08, 2022 at 09:23:52AM -0300, Jason Gunthorpe wrote: > Why rc3? It has been 4 weeks now with no futher comments. Because I start applying new code to branches based on -rc3. In the past I used different -rc's for the topic branches (usually the latest -rc available when I started applying to that branch), but that caused silly merge conflicts from time to time. So I am now basing every topic branch on the same -rc, which is usually -rc3. Rationale is that by -rc3 time the kernel should have reasonably stabilized after the merge window. Regards, Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 4/4] vfio: Require that devices support DMA cache coherence
On Fri, Apr 08, 2022 at 02:28:02PM +0100, Robin Murphy wrote: > > > One nit. Is it logistically more reasonable to put this patch before > > > changing VFIO to always set IOMMU_CACHE? > > > > For bisectability it has to be after > > > > iommu: Redefine IOMMU_CAP_CACHE_COHERENCY as the cap flag for > > IOMMU_CACHE > > > > Otherwise Intel iommu will stop working with VFIO > > > > The ordering is OK as is because no IOMMU that works with VFIO cares > > about IOMMU_CACHE. > > The Arm SMMU drivers do (without it even coherent traffic would be > downgraded to non-cacheable), but then they also handle > IOMMU_CAP_CACHE_COHERENCY nonsensically, and it happens to work out since > AFAIK there aren't (yet) any Arm-based systems where you can reasonably try > to use VFIO that don't also have hardware-coherent PCI. Thus I don't think > there's any risk of regression for us here. Right, I was unclear, I meant 'requires IOMMU_CACHE to be unset to work with VFIO' Jason ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 0/4] Make the iommu driver no-snoop block feature consistent
On Fri, Apr 08, 2022 at 02:11:10PM +0100, Robin Murphy wrote:
> > However, this creates an oddball situation where the vfio_device and
> > its struct device could become unplugged from the system while the
> > domain that the struct device spawned continues to exist and remains
> > attached to other devices in the same group. ie the iommu driver has
> > to be careful not to retain the struct device input..
>
> Oh, I rather assumed that VFIO might automatically tear down the
> container/domain when the last real user disappears.

It does, that isn't quite what I mean..

Let's say a simple case with two groups and two devices.

 Open a VFIO container FD.
 We open group A and SET_CONTAINER it. This results in:
    domain_A = iommu_domain_alloc(device_A)
    iommu_attach_group(domain_A, device_A->group)
 We open group B and SET_CONTAINER it. Using the sharing logic we end up
 doing:
    iommu_attach_group(domain_A, device_B->group)

Now we close the group A FD, detach device_A->group from domain_A, and
the driver core hot-unplugs device A completely from the system.
However, domain_A remains in the system, used by group B's open FD.

It is a bit funny at least.. I think it is just something to document
and be aware of for iommu driver writers: they probably shouldn't try to
store the allocation device in their domain struct. IMHO the only
purpose of the allocation device is to crystallize the configuration of
the iommu_domain at allocation time.

> as long as we take care not to release DMA ownership until that point
> also. As you say, it just looks a bit funny.

The DMA ownership should be OK as we take ownership on each group FD
open.

> > I suppose that is inevitable to have sharing of domains across
> > devices, so the iommu drivers will have to accommodate this.
>
> I think domain lifecycle management is already entirely up to the users
> and not something that IOMMU drivers need to worry about. Drivers
> should only need to look at per-device data in attach/detach (and, once
> I've finished, alloc) from the device argument which can be assumed to
> be valid at that point. Otherwise, all the relevant internal data for
> domain ops should belong to the domain already.

Making attach/detach take a struct device would be nice - but I would
expect the attach/detach to use a strictly paired struct device, and I
don't think this trick of selecting an arbitrary vfio_device will
achieve that.

So, I suppose VFIO would want to attach/detach on every vfio_device
individually, and it would iterate over the group instead of doing a
list_first_entry() like above. This would not be hard to do in VFIO.

Not sure what the iommu layer would have to do to accommodate this..

Jason
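In condensed form, the lifetime sequence described above looks roughly like
this (pseudo-code only, not the actual VFIO implementation; device_A and
device_B stand for a device in each group):

        /* group A SET_CONTAINER */
        domain_A = iommu_domain_alloc(device_A);
        iommu_attach_group(domain_A, device_A->group);

        /* group B SET_CONTAINER - the sharing logic reuses domain_A */
        iommu_attach_group(domain_A, device_B->group);

        /* group A FD closed */
        iommu_detach_group(domain_A, device_A->group);
        /* device_A may now be hot-unplugged entirely, yet domain_A lives
         * on, still attached to group B - so the iommu driver must not
         * have cached the allocating struct device inside domain_A. */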
Re: [PATCH v2 4/4] vfio: Require that devices support DMA cache coherence
On 2022-04-08 13:22, Jason Gunthorpe wrote: On Fri, Apr 08, 2022 at 08:26:10AM +, Tian, Kevin wrote: From: Jason Gunthorpe Sent: Thursday, April 7, 2022 11:24 PM IOMMU_CACHE means that normal DMAs do not require any additional coherency mechanism and is the basic uAPI that VFIO exposes to userspace. For instance VFIO applications like DPDK will not work if additional coherency operations are required. Therefore check IOMMU_CAP_CACHE_COHERENCY like vdpa & usnic do before allowing an IOMMU backed VFIO device to be created. Signed-off-by: Jason Gunthorpe drivers/vfio/vfio.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c index a4555014bd1e72..9edad767cfdad3 100644 +++ b/drivers/vfio/vfio.c @@ -815,6 +815,13 @@ static int __vfio_register_dev(struct vfio_device *device, int vfio_register_group_dev(struct vfio_device *device) { + /* +* VFIO always sets IOMMU_CACHE because we offer no way for userspace to +* restore cache coherency. +*/ + if (!iommu_capable(device->dev->bus, IOMMU_CAP_CACHE_COHERENCY)) + return -EINVAL; + One nit. Is it logistically more reasonable to put this patch before changing VFIO to always set IOMMU_CACHE? For bisectability it has to be after iommu: Redefine IOMMU_CAP_CACHE_COHERENCY as the cap flag for IOMMU_CACHE Otherwise Intel iommu will stop working with VFIO The ordering is OK as is because no IOMMU that works with VFIO cares about IOMMU_CACHE. The Arm SMMU drivers do (without it even coherent traffic would be downgraded to non-cacheable), but then they also handle IOMMU_CAP_CACHE_COHERENCY nonsensically, and it happens to work out since AFAIK there aren't (yet) any Arm-based systems where you can reasonably try to use VFIO that don't also have hardware-coherent PCI. Thus I don't think there's any risk of regression for us here. Robin. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 0/4] Make the iommu driver no-snoop block feature consistent
On 2022-04-08 13:18, Jason Gunthorpe wrote: On Thu, Apr 07, 2022 at 08:27:05PM +0100, Robin Murphy wrote: On 2022-04-07 20:08, Jason Gunthorpe wrote: On Thu, Apr 07, 2022 at 07:02:03PM +0100, Robin Murphy wrote: On 2022-04-07 18:43, Jason Gunthorpe wrote: On Thu, Apr 07, 2022 at 06:03:37PM +0100, Robin Murphy wrote: At a glance, this all looks about the right shape to me now, thanks! Thanks! Ideally I'd hope patch #4 could go straight to device_iommu_capable() from my Thunderbolt series, but we can figure that out in a couple of weeks once Yes, this does helps that because now the only iommu_capable call is in a context where a device is available :) Derp, of course I have *two* VFIO patches waiting, the other one touching the iommu_capable() calls (there's still IOMMU_CAP_INTR_REMAP, which, much as I hate it and would love to boot all that stuff over to drivers/irqchip, Oh me too... it's not in my way so I'm leaving it be for now). I'll have to rebase that anyway, so merging this as-is is absolutely fine! This might help your effort - after this series and this below there are no 'bus' users of iommu_capable left at all. Thanks, but I still need a device for the iommu_domain_alloc() as well, so at that point the interrupt check is OK to stay where it is. It is a simple enough change that could avoid introducing the device_iommu_capable() at all perhaps. I figured out a locking strategy per my original idea that seems pretty clean, it just needs vfio_group_viable() to go away first: I think this should be more like: struct vfio_device *vdev; mutex_lock(&group->device_lock); vdev = list_first_entry(group->device_list, struct vfio_device, group_next); ret = driver->ops->attach_group(data, group->iommu_group, group->type, vdev->dev); mutex_unlock(&group->device_lock); Then don't do the iommu_group_for_each_dev() at all. The problem with iommu_group_for_each_dev() is that it may select a struct device that does not have a vfio_device bound to it, so we would be using a random struct device that is not protected by any VFIO device_driver. Yeah, I was just looking at the final puzzle piece of making sure to nab the actual VFIO device rather than some unbound device that's just along for the ride... If I can't come up with anything more self-contained I'll take this suggestion, thanks. However, this creates an oddball situation where the vfio_device and it's struct device could become unplugged from the system while the domain that the struct device spawned continues to exist and remains attached to other devices in the same group. ie the iommu driver has to be careful not to retain the struct device input.. Oh, I rather assumed that VFIO might automatically tear down the container/domain when the last real user disappears. I don't see there being an obvious problem if the domain does hang around with residual non-VFIO devices attached until userspace explicitly closes all the relevant handles, as long as we take care not to release DMA ownership until that point also. As you say, it just looks a bit funny. I suppose that is inevitable to have sharing of domains across devices, so the iommu drivers will have to accommodate this. I think domain lifecycle management is already entirely up to the users and not something that IOMMU drivers need to worry about. Drivers should only need to look at per-device data in attach/detach (and, once I've finished, alloc) from the device argument which can be assumed to be valid at that point. 
Otherwise, all the relevant internal data for domain ops should belong to the domain already. Cheers, Robin. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
On Fri, Apr 08, 2022 at 08:22:35PM +0800, Lu Baolu wrote:
> Hi Joerg,
>
> On 2022/4/8 15:57, Joerg Roedel wrote:
> > On Mon, Mar 14, 2022 at 09:21:25PM -0300, Jason Gunthorpe wrote:
> > > Joerg, are we good for the coming v5.18 merge window now? There are
> > > several things backed up behind this series.
> >
> > I usually don't apply bigger changes like this after -rc7, so it didn't
> > make it. Please re-send after -rc3 is out and I will consider it.
>
> Sure. I will do.

Why rc3? It has been 4 weeks now with no further comments.

Jason
Re: [PATCH v2 4/4] vfio: Require that devices support DMA cache coherence
On Fri, Apr 08, 2022 at 08:26:10AM +, Tian, Kevin wrote: > > From: Jason Gunthorpe > > Sent: Thursday, April 7, 2022 11:24 PM > > > > IOMMU_CACHE means that normal DMAs do not require any additional > > coherency > > mechanism and is the basic uAPI that VFIO exposes to userspace. For > > instance VFIO applications like DPDK will not work if additional coherency > > operations are required. > > > > Therefore check IOMMU_CAP_CACHE_COHERENCY like vdpa & usnic do > > before > > allowing an IOMMU backed VFIO device to be created. > > > > Signed-off-by: Jason Gunthorpe > > drivers/vfio/vfio.c | 7 +++ > > 1 file changed, 7 insertions(+) > > > > diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c > > index a4555014bd1e72..9edad767cfdad3 100644 > > +++ b/drivers/vfio/vfio.c > > @@ -815,6 +815,13 @@ static int __vfio_register_dev(struct vfio_device > > *device, > > > > int vfio_register_group_dev(struct vfio_device *device) > > { > > + /* > > +* VFIO always sets IOMMU_CACHE because we offer no way for > > userspace to > > +* restore cache coherency. > > +*/ > > + if (!iommu_capable(device->dev->bus, > > IOMMU_CAP_CACHE_COHERENCY)) > > + return -EINVAL; > > + > > One nit. Is it logistically more reasonable to put this patch before > changing VFIO to always set IOMMU_CACHE? For bisectability it has to be after iommu: Redefine IOMMU_CAP_CACHE_COHERENCY as the cap flag for IOMMU_CACHE Otherwise Intel iommu will stop working with VFIO The ordering is OK as is because no IOMMU that works with VFIO cares about IOMMU_CACHE. Jason ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
Hi Joerg, On 2022/4/8 15:57, Joerg Roedel wrote: On Mon, Mar 14, 2022 at 09:21:25PM -0300, Jason Gunthorpe wrote: Joerg, are we good for the coming v5.18 merge window now? There are several things backed up behind this series. I usually don't apply bigger changes like this after -rc7, so it didn't make it. Please re-send after -rc3 is out and I will consider it. Sure. I will do. Best regards, baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 3/4] iommu: Redefine IOMMU_CAP_CACHE_COHERENCY as the cap flag for IOMMU_CACHE
On Fri, Apr 08, 2022 at 08:21:55AM +, Tian, Kevin wrote: > (CC Jason Wang) > > > From: Jason Gunthorpe > > Sent: Thursday, April 7, 2022 11:24 PM > > > > While the comment was correct that this flag was intended to convey the > > block no-snoop support in the IOMMU, it has become widely implemented > > and > > used to mean the IOMMU supports IOMMU_CACHE as a map flag. Only the > > Intel > > driver was different. > > > > Now that the Intel driver is using enforce_cache_coherency() update the > > comment to make it clear that IOMMU_CAP_CACHE_COHERENCY is only > > about > > IOMMU_CACHE. Fix the Intel driver to return true since IOMMU_CACHE > > always > > works. > > > > The two places that test this flag, usnic and vdpa, are both assigning > > userspace pages to a driver controlled iommu_domain and require > > IOMMU_CACHE behavior as they offer no way for userspace to synchronize > > caches. > > > > Signed-off-by: Jason Gunthorpe > > Reviewed-by: Kevin Tian > > btw the comment about vsnic and vdpa matches my thought. But > a recent change in Qemu [1] possibly wants confirmation from > Jason Wang. > > [1] https://lore.kernel.org/all/20220304133556.233983-20-...@redhat.com/ That patch seems to have run into the confusion this series is addressing. VFIO_DMA_CC_IOMMU and snoop control is absolutely not needed by VDPA. We expect the VDPA kernel driver to be well behaved and not cause its device to generate no-snoop TLPs. VDPA needs IOMMU_CACHE only. Jason ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 0/4] Make the iommu driver no-snoop block feature consistent
On Thu, Apr 07, 2022 at 08:27:05PM +0100, Robin Murphy wrote:
> On 2022-04-07 20:08, Jason Gunthorpe wrote:
> > On Thu, Apr 07, 2022 at 07:02:03PM +0100, Robin Murphy wrote:
> > > On 2022-04-07 18:43, Jason Gunthorpe wrote:
> > > > On Thu, Apr 07, 2022 at 06:03:37PM +0100, Robin Murphy wrote:
> > > > > At a glance, this all looks about the right shape to me now, thanks!
> > > >
> > > > Thanks!
> > > >
> > > > > Ideally I'd hope patch #4 could go straight to device_iommu_capable()
> > > > > from my Thunderbolt series, but we can figure that out in a couple
> > > > > of weeks once
> > > >
> > > > Yes, this does helps that because now the only iommu_capable call is
> > > > in a context where a device is available :)
> > >
> > > Derp, of course I have *two* VFIO patches waiting, the other one
> > > touching the iommu_capable() calls (there's still IOMMU_CAP_INTR_REMAP,
> > > which, much as I hate it and would love to boot all that stuff over to
> > > drivers/irqchip,
> >
> > Oh me too...
> >
> > > it's not in my way so I'm leaving it be for now). I'll have to rebase
> > > that anyway, so merging this as-is is absolutely fine!
> >
> > This might help your effort - after this series and this below there
> > are no 'bus' users of iommu_capable left at all.
>
> Thanks, but I still need a device for the iommu_domain_alloc() as well,
> so at that point the interrupt check is OK to stay where it is.

It is a simple enough change that could avoid introducing the
device_iommu_capable() at all perhaps.

> I figured out a locking strategy per my original idea that seems
> pretty clean, it just needs vfio_group_viable() to go away first:

I think this should be more like:

        struct vfio_device *vdev;

        mutex_lock(&group->device_lock);
        vdev = list_first_entry(group->device_list, struct vfio_device,
                                group_next);
        ret = driver->ops->attach_group(data, group->iommu_group,
                                        group->type, vdev->dev);
        mutex_unlock(&group->device_lock);

Then don't do the iommu_group_for_each_dev() at all.

The problem with iommu_group_for_each_dev() is that it may select a
struct device that does not have a vfio_device bound to it, so we would
be using a random struct device that is not protected by any VFIO
device_driver.

However, this creates an oddball situation where the vfio_device and its
struct device could become unplugged from the system while the domain
that the struct device spawned continues to exist and remains attached
to other devices in the same group. ie the iommu driver has to be
careful not to retain the struct device input..

I suppose that is inevitable to have sharing of domains across devices,
so the iommu drivers will have to accommodate this.

Jason
[git pull] IOMMU Fixes for Linux v5.18-rc1
Hi Linus,

The following changes since commit 3123109284176b1532874591f7c81f3837bbdc17:

  Linux 5.18-rc1 (2022-04-03 14:08:21 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git tags/iommu-fix-v5.18-rc1

for you to fetch changes up to 71ff461c3f41f6465434b9e980c01782763e7ad8:

  iommu/omap: Fix regression in probe for NULL pointer dereference (2022-04-08 11:16:29 +0200)

IOMMU Fix for Linux v5.18-rc1:

	- Fix boot regression due to a NULL-ptr dereference on OMAP machines

Tony Lindgren (1):
      iommu/omap: Fix regression in probe for NULL pointer dereference

 drivers/iommu/omap-iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Please pull.

Thanks,
Joerg
Re: [PATCH v2 0/4] Make the iommu driver no-snoop block feature consistent
On 2022-04-08 10:08, Tian, Kevin wrote: From: Jason Gunthorpe Sent: Friday, April 8, 2022 3:08 AM On Thu, Apr 07, 2022 at 07:02:03PM +0100, Robin Murphy wrote: On 2022-04-07 18:43, Jason Gunthorpe wrote: On Thu, Apr 07, 2022 at 06:03:37PM +0100, Robin Murphy wrote: At a glance, this all looks about the right shape to me now, thanks! Thanks! Ideally I'd hope patch #4 could go straight to device_iommu_capable() from my Thunderbolt series, but we can figure that out in a couple of weeks once Yes, this does helps that because now the only iommu_capable call is in a context where a device is available :) Derp, of course I have *two* VFIO patches waiting, the other one touching the iommu_capable() calls (there's still IOMMU_CAP_INTR_REMAP, which, much as I hate it and would love to boot all that stuff over to drivers/irqchip, Oh me too... it's not in my way so I'm leaving it be for now). I'll have to rebase that anyway, so merging this as-is is absolutely fine! This might help your effort - after this series and this below there are no 'bus' users of iommu_capable left at all. Out of curiosity, while iommu_capable is being moved to a per-device interface what about irq_domain_check_msi_remap() below (which is also a global check)? I suppose it could if anyone cared enough to make the effort - probably a case of resolving specific MSI domains for every device in the group, and potentially having to deal with hotplug later as well. Realistically, though, I wouldn't expect systems to have mixed capabilities in that regard (i.e. where the check would return false even though *some* domains support remapping), so there doesn't seem to be any pressing need to relax it. Cheers, Robin. +static int vfio_iommu_device_ok(void *iommu_data, struct device *device) +{ + bool msi_remap; + + msi_remap = irq_domain_check_msi_remap() || + iommu_capable(device->bus, IOMMU_CAP_INTR_REMAP); + ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/omap: Fix regression in probe for NULL pointer dereference
Hi, * Joerg Roedel [220408 08:23]: > On Thu, Apr 07, 2022 at 08:39:05AM +0300, Tony Lindgren wrote: > > Can you guys please get this fix into the -rc series? Or ack it so > > I can pick it up into my fixes branch? > > Sorry for the delay, Covid catched me so I was away from email for > almost 2 week. This patch is picked-up now and I will send it upstream > for -rc2. OK welcome back then, and hopefully no serious symptoms. Thanks for picking up the patch. Regards, Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH v2 0/4] Make the iommu driver no-snoop block feature consistent
> From: Jason Gunthorpe > Sent: Friday, April 8, 2022 3:08 AM > On Thu, Apr 07, 2022 at 07:02:03PM +0100, Robin Murphy wrote: > > On 2022-04-07 18:43, Jason Gunthorpe wrote: > > > On Thu, Apr 07, 2022 at 06:03:37PM +0100, Robin Murphy wrote: > > > > At a glance, this all looks about the right shape to me now, thanks! > > > > > > Thanks! > > > > > > > Ideally I'd hope patch #4 could go straight to device_iommu_capable() > from > > > > my Thunderbolt series, but we can figure that out in a couple of weeks > once > > > > > > Yes, this does helps that because now the only iommu_capable call is > > > in a context where a device is available :) > > > > Derp, of course I have *two* VFIO patches waiting, the other one touching > > the iommu_capable() calls (there's still IOMMU_CAP_INTR_REMAP, which, > much > > as I hate it and would love to boot all that stuff over to > > drivers/irqchip, > > Oh me too... > > > it's not in my way so I'm leaving it be for now). I'll have to rebase that > > anyway, so merging this as-is is absolutely fine! > > This might help your effort - after this series and this below there > are no 'bus' users of iommu_capable left at all. > Out of curiosity, while iommu_capable is being moved to a per-device interface what about irq_domain_check_msi_remap() below (which is also a global check)? > +static int vfio_iommu_device_ok(void *iommu_data, struct device *device) > +{ > + bool msi_remap; > + > + msi_remap = irq_domain_check_msi_remap() || > + iommu_capable(device->bus, IOMMU_CAP_INTR_REMAP); > + ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH v2 1/4] iommu: Introduce the domain op enforce_cache_coherency()
> From: Tian, Kevin > Sent: Friday, April 8, 2022 4:06 PM > > > From: Jason Gunthorpe > > Sent: Thursday, April 7, 2022 11:24 PM > > > > This new mechanism will replace using IOMMU_CAP_CACHE_COHERENCY > > and > > IOMMU_CACHE to control the no-snoop blocking behavior of the IOMMU. > > > > Currently only Intel and AMD IOMMUs are known to support this > > feature. They both implement it as an IOPTE bit, that when set, will cause > > PCIe TLPs to that IOVA with the no-snoop bit set to be treated as though > > the no-snoop bit was clear. > > > > The new API is triggered by calling enforce_cache_coherency() before > > mapping any IOVA to the domain which globally switches on no-snoop > > blocking. This allows other implementations that might block no-snoop > > globally and outside the IOPTE - AMD also documents such a HW capability. > > > > Leave AMD out of sync with Intel and have it block no-snoop even for > > in-kernel users. This can be trivially resolved in a follow up patch. > > > > Only VFIO will call this new API. > > I still didn't see the point of mandating a caller for a new API (and as > you pointed out iommufd will call it too). > > [...] > > diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h > > index 2f9891cb3d0014..1f930c0c225d94 100644 > > --- a/include/linux/intel-iommu.h > > +++ b/include/linux/intel-iommu.h > > @@ -540,6 +540,7 @@ struct dmar_domain { > > u8 has_iotlb_device: 1; > > u8 iommu_coherency: 1; /* indicate coherency of > > iommu access */ > > u8 iommu_snooping: 1; /* indicate snooping control > > feature */ > > + u8 enforce_no_snoop : 1;/* Create IOPTEs with snoop control */ > > it reads like no_snoop is the result of the enforcement... Probably > force_snooping better matches the intention here. > Above are just nits. Here is: Reviewed-by: Kevin Tian ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH v2 4/4] vfio: Require that devices support DMA cache coherence
> From: Jason Gunthorpe > Sent: Thursday, April 7, 2022 11:24 PM > > IOMMU_CACHE means that normal DMAs do not require any additional > coherency > mechanism and is the basic uAPI that VFIO exposes to userspace. For > instance VFIO applications like DPDK will not work if additional coherency > operations are required. > > Therefore check IOMMU_CAP_CACHE_COHERENCY like vdpa & usnic do > before > allowing an IOMMU backed VFIO device to be created. > > Signed-off-by: Jason Gunthorpe > --- > drivers/vfio/vfio.c | 7 +++ > 1 file changed, 7 insertions(+) > > diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c > index a4555014bd1e72..9edad767cfdad3 100644 > --- a/drivers/vfio/vfio.c > +++ b/drivers/vfio/vfio.c > @@ -815,6 +815,13 @@ static int __vfio_register_dev(struct vfio_device > *device, > > int vfio_register_group_dev(struct vfio_device *device) > { > + /* > + * VFIO always sets IOMMU_CACHE because we offer no way for > userspace to > + * restore cache coherency. > + */ > + if (!iommu_capable(device->dev->bus, > IOMMU_CAP_CACHE_COHERENCY)) > + return -EINVAL; > + One nit. Is it logistically more reasonable to put this patch before changing VFIO to always set IOMMU_CACHE? otherwise, Reviewed-by: Kevin Tian ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/omap: Fix regression in probe for NULL pointer dereference
On Thu, Apr 07, 2022 at 08:39:05AM +0300, Tony Lindgren wrote:
> Can you guys please get this fix into the -rc series? Or ack it so
> I can pick it up into my fixes branch?

Sorry for the delay, Covid caught me so I was away from email for almost
2 weeks. This patch is picked up now and I will send it upstream for
-rc2.

Thanks,
Joerg
RE: [PATCH v2 3/4] iommu: Redefine IOMMU_CAP_CACHE_COHERENCY as the cap flag for IOMMU_CACHE
(CC Jason Wang)

> From: Jason Gunthorpe
> Sent: Thursday, April 7, 2022 11:24 PM
>
> While the comment was correct that this flag was intended to convey the
> block no-snoop support in the IOMMU, it has become widely implemented
> and used to mean the IOMMU supports IOMMU_CACHE as a map flag. Only the
> Intel driver was different.
>
> Now that the Intel driver is using enforce_cache_coherency() update the
> comment to make it clear that IOMMU_CAP_CACHE_COHERENCY is only about
> IOMMU_CACHE. Fix the Intel driver to return true since IOMMU_CACHE
> always works.
>
> The two places that test this flag, usnic and vdpa, are both assigning
> userspace pages to a driver controlled iommu_domain and require
> IOMMU_CACHE behavior as they offer no way for userspace to synchronize
> caches.
>
> Signed-off-by: Jason Gunthorpe

Reviewed-by: Kevin Tian

btw the comment about usnic and vdpa matches my thought. But a recent
change in Qemu [1] possibly wants confirmation from Jason Wang.

[1] https://lore.kernel.org/all/20220304133556.233983-20-...@redhat.com/

Thanks
Kevin
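For reference, the test in question is just a probe-time capability check
along these lines (an illustrative sketch only; the exact call sites in
usnic and vhost-vdpa differ slightly in how they obtain the bus and in the
error code they return):

        /* given the struct device *dev being set up for userspace DMA */
        if (!iommu_capable(dev->bus, IOMMU_CAP_CACHE_COHERENCY)) {
                dev_warn(dev, "IOMMU cannot enforce cache coherent DMA\n");
                return -EPERM;
        }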
RE: [PATCH v2 2/4] vfio: Move the Intel no-snoop control off of IOMMU_CACHE
> From: Jason Gunthorpe
> Sent: Thursday, April 7, 2022 11:24 PM
>
> IOMMU_CACHE means "normal DMA to this iommu_domain's IOVA should be
> cache coherent" and is used by the DMA API. The definition allows for
> special non-coherent DMA to exist - ie processing of the no-snoop flag
> in PCIe TLPs - so long as this behavior is opt-in by the device driver.
>
> The flag is mainly used by the DMA API to synchronize the IOMMU setting
> with the expected cache behavior of the DMA master. eg based on
> dev_is_dma_coherent() in some case.
>
> For Intel IOMMU IOMMU_CACHE was redefined to mean 'force all DMA to be
> cache coherent' which has the practical effect of causing the IOMMU to
> ignore the no-snoop bit in a PCIe TLP.
>
> x86 platforms are always IOMMU_CACHE, so Intel should ignore this flag.
>
> Instead use the new domain op enforce_cache_coherency() which causes
> every IOPTE created in the domain to have the no-snoop blocking
> behavior.
>
> Reconfigure VFIO to always use IOMMU_CACHE and call
> enforce_cache_coherency() to operate the special Intel behavior.
>
> Remove the IOMMU_CACHE test from Intel IOMMU.
>
> Ultimately VFIO plumbs the result of enforce_cache_coherency() back into
> the x86 platform code through kvm_arch_register_noncoherent_dma() which
> controls if the WBINVD instruction is available in the guest. No other
> arch implements kvm_arch_register_noncoherent_dma().
>
> Signed-off-by: Jason Gunthorpe

Reviewed-by: Kevin Tian

btw as discussed in the last version it is not necessary to recalculate
snoop control globally with this new approach. Will follow up to clean it
up after this series is merged.
RE: [PATCH v2 1/4] iommu: Introduce the domain op enforce_cache_coherency()
> From: Jason Gunthorpe
> Sent: Thursday, April 7, 2022 11:24 PM
>
> This new mechanism will replace using IOMMU_CAP_CACHE_COHERENCY and
> IOMMU_CACHE to control the no-snoop blocking behavior of the IOMMU.
>
> Currently only Intel and AMD IOMMUs are known to support this
> feature. They both implement it as an IOPTE bit, that when set, will
> cause PCIe TLPs to that IOVA with the no-snoop bit set to be treated as
> though the no-snoop bit was clear.
>
> The new API is triggered by calling enforce_cache_coherency() before
> mapping any IOVA to the domain which globally switches on no-snoop
> blocking. This allows other implementations that might block no-snoop
> globally and outside the IOPTE - AMD also documents such a HW
> capability.
>
> Leave AMD out of sync with Intel and have it block no-snoop even for
> in-kernel users. This can be trivially resolved in a follow up patch.
>
> Only VFIO will call this new API.

I still don't see the point of mandating a caller for a new API (and as
you pointed out iommufd will call it too).

[...]

> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index 2f9891cb3d0014..1f930c0c225d94 100644
> --- a/include/linux/intel-iommu.h
> +++ b/include/linux/intel-iommu.h
> @@ -540,6 +540,7 @@ struct dmar_domain {
> 	u8 has_iotlb_device: 1;
> 	u8 iommu_coherency: 1;	/* indicate coherency of iommu access */
> 	u8 iommu_snooping: 1;	/* indicate snooping control feature */
> +	u8 enforce_no_snoop : 1;	/* Create IOPTEs with snoop control */

it reads like no_snoop is the result of the enforcement... Probably
force_snooping better matches the intention here.

Thanks
Kevin
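As context for the review comments, the expected use of the new op (per the
cover letter and patch 2/4) is roughly the following - a sketch only; the
exact VFIO call site may look different:

        bool enforced;

        /*
         * Called once per domain, before any IOVA is mapped. A false
         * return means the IOMMU cannot block no-snoop TLPs for this
         * domain, which VFIO then surfaces through VFIO_DMA_CC_IOMMU /
         * kvm_arch_register_noncoherent_dma().
         */
        if (domain->ops->enforce_cache_coherency)
                enforced = domain->ops->enforce_cache_coherency(domain);
        else
                enforced = false;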
Re: [PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
On Mon, Mar 14, 2022 at 09:21:25PM -0300, Jason Gunthorpe wrote: > Joerg, are we good for the coming v5.18 merge window now? There are > several things backed up behind this series. I usually don't apply bigger changes like this after -rc7, so it didn't make it. Please re-send after -rc3 is out and I will consider it. Thanks, Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH net-next] sfc: Stop using iommu_present()
Hello: This patch was applied to netdev/net-next.git (master) by Jakub Kicinski : On Tue, 5 Apr 2022 14:40:55 +0100 you wrote: > Even if an IOMMU might be present for some PCI segment in the system, > that doesn't necessarily mean it provides translation for the device > we care about. It appears that what we care about here is specifically > whether DMA mapping ops involve any IOMMU overhead or not, so check for > translation actually being active for our device. > > Signed-off-by: Robin Murphy > > [...] Here is the summary with links: - [net-next] sfc: Stop using iommu_present() https://git.kernel.org/netdev/net-next/c/6a62924c0a81 You are awesome, thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] drm/rockchip: Refactor IOMMU initialisation
On Tue, Apr 05, 2022 at 03:32:50PM +0100, Robin Murphy wrote:
> Defer the IOMMU domain setup until after successfully binding
> components, so we can figure out IOMMU support directly from the VOP
> devices themselves, rather than manually inferring it from the DT (which
> also fails to account for whether the IOMMU driver is actually loaded).
> Although this is somewhat of a logical cleanup, the main motivation is
> to prepare for a change in the iommu_domain_alloc() interface.
>
> Signed-off-by: Robin Murphy
> ---
>
> Lightly tested on RK3288. This does stand to collide with the in-flight
> VOP2 driver a little, if only that that will want an equivalent
> rockchip_drm_dma_init_device() call to be able to use its IOMMU. I'm
> happy to help sort that out either way, it just depends on what we want
> to merge first.

I tested it with the VOP2 driver. It needed a little refactoring, I had
to move the call to rockchip_drm_dma_attach_device() from vop2_bind() to
vop2_enable(), but then it works as expected.

Assuming that this patch goes through Heiko's tree just like the VOP2
driver, we can merge it first. I'll base the next VOP2 round on it.

Sascha

-- 
Pengutronix e.K.                 |
Steuerwalder Str. 21             | http://www.pengutronix.de/
31137 Hildesheim, Germany        | Phone: +49-5121-206917-0
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-